Format table with emojis results in misaligned table #25671 #26567

menochs1 · 2025-12-03T04:27:25Z

PR Summary

Fix Format-Table misalignment when displaying emojis by implementing proper Unicode surrogate pair handling and emoji width detection.

PR Context

Fixes #25671 (and likely duplicates #23861, #6290)

Format-Table was incorrectly calculating the display width of emojis, treating them as 1 cell wide when they actually occupy 2 cells on the terminal. This caused misalignment where columns after emoji-containing cells would have extra spacing.

The root cause was that the character width calculation in DisplayCells.CharLengthInBufferCells() did not:

Handle UTF-16 surrogate pairs (used by most emojis outside the Basic Multilingual Plane)
Recognize emoji characters in the BMP (U+2600-U+26FF range) as wide characters

This PR updates the width calculation logic in ILineOutput.cs to:

Detect and properly process surrogate pairs as single code points
Add a new CodePointLengthInBufferCells() method to handle full Unicode code points
Recognize common emoji ranges (Miscellaneous Symbols, Dingbats, etc.) as 2-cell wide
Update the GetFitLength() method to handle surrogate pairs when truncating strings

PR Checklist

PR has a meaningful title
- Use the present tense and imperative mood when describing your changes
Summarized changes
Make sure all .h, .cpp, .cs, .ps1 and .psm1 files have the correct copyright header
This PR is ready to merge. If this PR is a work in progress, please open this as a Draft Pull Request and mark it as Ready to Review when it is ready to merge.
Breaking changes
- None
- OR
- Experimental feature(s) needed
  - Experimental feature name(s):
User-facing changes
- Not Applicable
- OR
- Documentation needed
  - Issue filed: Format-Table with emojis results in misaligned table #25671
Testing - New and feature
- N/A or can only be tested interactively
- OR
- Make sure you've added a new test if existing tests do not effectively test the code changed

merged Zackbranch

menochs1 · 2025-12-03T04:34:34Z

@microsoft-github-policy-service agree

Copilot

Pull request overview

This PR fixes a long-standing Format-Table misalignment issue when displaying emoji characters by implementing proper Unicode surrogate pair handling and emoji width detection in the DisplayCells class. The changes replace reliance on an external package with native emoji width calculation logic that recognizes emojis as occupying 2 terminal cells instead of 1.

Key Changes:

Refactored Length() method to use grapheme cluster enumeration via StringInfo.ParseCombiningCharacters() instead of character-by-character iteration
Added CodePointLengthInBufferCells() and GraphemeLengthInBufferCells() helper methods to calculate display widths for full Unicode code points and grapheme clusters
Updated CharLengthInBufferCells() to recognize BMP emoji ranges (U+2600-U+26FF, U+2700-U+27BF, etc.) as wide characters
Enhanced GetFitLength() to properly handle surrogate pairs during string truncation in both forward and backward directions

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 16 comments.

File	Description
`test_emoji.cs`	Standalone test/validation file demonstrating emoji width calculations (should be moved to proper test directory or removed)
`src/System.Management.Automation/System.Management.Automation.csproj`	Removed Wcwidth.Net package dependency with comment explaining the removal
`src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs`	Core implementation: added grapheme cluster processing, surrogate pair handling, and comprehensive emoji range detection for proper width calculation

Comments suppressed due to low confidence (1)

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs:56

This assignment to si is useless, since its value is never read.

 System.Globalization.StringInfo si = new System.Globalization.StringInfo(str);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-03T04:36:24Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ (codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF)) // Regional Indicator Symbols
+ {
+ return 2;
+ }


The GraphemeLengthInBufferCells() method doesn't handle wide CJK characters properly. It only checks for emoji ranges but ignores the existing CJK width logic in CharLengthInBufferCells(). A grapheme containing CJK characters should still be calculated as 2 cells wide, but this implementation defaults to returning 1 (line 95).

Consider falling back to CharLengthInBufferCells() for BMP characters that aren't emojis, similar to how CodePointLengthInBufferCells() does on lines 182-184.

Suggested change

}

}

// For BMP characters, use CharLengthInBufferCells to handle CJK/wide logic

if (codePoint <= 0xFFFF)

return CharLengthInBufferCells((char)codePoint);

Copilot · 2025-12-03T04:36:25Z

src/System.Management.Automation/System.Management.Automation.csproj

 <!-- the following package(s) are from the powershell org -->
 <PackageReference Include="Microsoft.Management.Infrastructure" Version="3.0.0" />
 <PackageReference Include="Microsoft.PowerShell.Native" Version="700.0.0-preview.2" />
+ <!-- Removed Wcwidth.Net package reference -->


[nitpick] The comment about removing Wcwidth.Net package reference doesn't add useful information and creates an orphaned comment line. The comment should either be removed entirely or clarified to explain why the package was removed (e.g., "Native emoji width calculation implemented to replace Wcwidth.Net dependency").

Suggested change

Copilot · 2025-12-03T04:36:25Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ // Use StringInfo to enumerate grapheme clusters
 int length = 0;
- for (; offset < str.Length; offset++)
+ System.Globalization.StringInfo si = new System.Globalization.StringInfo(str);
+ int[] textElementIndexes = System.Globalization.StringInfo.ParseCombiningCharacters(str);
+ for (int i = 0; i < textElementIndexes.Length; i++)
 {
- length += CharLengthInBufferCells(str[offset]);
+ if (i < offset) continue;
+ int start = textElementIndexes[i];
+ int graphemeLength = (i + 1 < textElementIndexes.Length)
+ ? textElementIndexes[i + 1] - textElementIndexes[i]
+ : str.Length - textElementIndexes[i];
+ string grapheme = str.Substring(start, graphemeLength);
+ length += GraphemeLengthInBufferCells(grapheme);
 }
-
 return length;


Performance concern: Using StringInfo.ParseCombiningCharacters() and Substring() for every call to Length() can be significantly slower than the previous character-by-character iteration, especially for long strings. ParseCombiningCharacters() allocates an array and performs multiple string scans.

For most strings (which don't contain complex grapheme clusters), this is unnecessary overhead. Consider adding a fast path that only uses the grapheme cluster approach when necessary:

// Fast path: check if string likely contains complex graphemes bool needsGraphemeProcessing = str.Any(c => char.IsSurrogate(c) || c >= 0x0300); if (!needsGraphemeProcessing) { // Use simpler character-by-character processing } else { // Use grapheme cluster processing }

Copilot · 2025-12-03T04:36:25Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ /// <summary>
+ /// Calculate the buffer cell length of a grapheme cluster (text element).
+ /// </summary>
+ /// <param name="grapheme">A string representing a single grapheme cluster.</param>
+ /// <returns>Number of buffer cells the grapheme needs to take.</returns>
+ protected virtual int GraphemeLengthInBufferCells(string grapheme)
+ {
+ if (string.IsNullOrEmpty(grapheme))
+ return 0;
+
+ // Check if the grapheme is an emoji (basic Unicode ranges)
+ int codePoint = char.ConvertToUtf32(grapheme, 0);
+ // Emoji ranges: Emoticons, Misc Symbols, Dingbats, Transport, etc.
+ if ((codePoint >= 0x1F600 && codePoint <= 0x1F64F) || // Emoticons
+ (codePoint >= 0x1F300 && codePoint <= 0x1F5FF) || // Misc Symbols & Pictographs
+ (codePoint >= 0x1F680 && codePoint <= 0x1F6FF) || // Transport & Map
+ (codePoint >= 0x2600 && codePoint <= 0x26FF) || // Misc symbols
+ (codePoint >= 0x2700 && codePoint <= 0x27BF) || // Dingbats
+ (codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols & Pictographs
+ (codePoint >= 0x1FA70 && codePoint <= 0x1FAFF) || // Symbols & Pictographs Extended-A
+ (codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF)) // Regional Indicator Symbols
+ {
+ return 2;
+ }
+ // Default width for other graphemes
+ return 1;
+ }


Inconsistent indentation: This method has excessive indentation (appears to be indented as if it's inside another block). It should align with other methods in the class.

Suggested change

/// <summary>

/// Calculate the buffer cell length of a grapheme cluster (text element).

/// </summary>

/// <param name="grapheme">A string representing a single grapheme cluster.</param>

/// <returns>Number of buffer cells the grapheme needs to take.</returns>

protected virtual int GraphemeLengthInBufferCells(string grapheme)

{

if (string.IsNullOrEmpty(grapheme))

return 0;

// Check if the grapheme is an emoji (basic Unicode ranges)

int codePoint = char.ConvertToUtf32(grapheme, 0);

// Emoji ranges: Emoticons, Misc Symbols, Dingbats, Transport, etc.

if ((codePoint >= 0x1F600 && codePoint <= 0x1F64F) || // Emoticons

(codePoint >= 0x1F300 && codePoint <= 0x1F5FF) || // Misc Symbols & Pictographs

(codePoint >= 0x1F680 && codePoint <= 0x1F6FF) || // Transport & Map

(codePoint >= 0x2600 && codePoint <= 0x26FF) || // Misc symbols

(codePoint >= 0x2700 && codePoint <= 0x27BF) || // Dingbats

(codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols & Pictographs

(codePoint >= 0x1FA70 && codePoint <= 0x1FAFF) || // Symbols & Pictographs Extended-A

(codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF)) // Regional Indicator Symbols

{

return 2;

}

// Default width for other graphemes

return 1;

}

/// <summary>

/// Calculate the buffer cell length of a grapheme cluster (text element).

/// </summary>

/// <param name="grapheme">A string representing a single grapheme cluster.</param>

/// <returns>Number of buffer cells the grapheme needs to take.</returns>

protected virtual int GraphemeLengthInBufferCells(string grapheme)

{

if (string.IsNullOrEmpty(grapheme))

return 0;

// Check if the grapheme is an emoji (basic Unicode ranges)

int codePoint = char.ConvertToUtf32(grapheme, 0);

// Emoji ranges: Emoticons, Misc Symbols, Dingbats, Transport, etc.

if ((codePoint >= 0x1F600 && codePoint <= 0x1F64F) || // Emoticons

(codePoint >= 0x1F300 && codePoint <= 0x1F5FF) || // Misc Symbols & Pictographs

(codePoint >= 0x1F680 && codePoint <= 0x1F6FF) || // Transport & Map

(codePoint >= 0x2600 && codePoint <= 0x26FF) || // Misc symbols

(codePoint >= 0x2700 && codePoint <= 0x27BF) || // Dingbats

(codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols & Pictographs

(codePoint >= 0x1FA70 && codePoint <= 0x1FAFF) || // Symbols & Pictographs Extended-A

(codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF)) // Regional Indicator Symbols

{

return 2;

}

// Default width for other graphemes

return 1;

}

Copilot · 2025-12-03T04:36:26Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ // Check if the grapheme is an emoji (basic Unicode ranges)
+ int codePoint = char.ConvertToUtf32(grapheme, 0);
+ // Emoji ranges: Emoticons, Misc Symbols, Dingbats, Transport, etc.
+ if ((codePoint >= 0x1F600 && codePoint <= 0x1F64F) || // Emoticons
+ (codePoint >= 0x1F300 && codePoint <= 0x1F5FF) || // Misc Symbols & Pictographs
+ (codePoint >= 0x1F680 && codePoint <= 0x1F6FF) || // Transport & Map
+ (codePoint >= 0x2600 && codePoint <= 0x26FF) || // Misc symbols
+ (codePoint >= 0x2700 && codePoint <= 0x27BF) || // Dingbats
+ (codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols & Pictographs
+ (codePoint >= 0x1FA70 && codePoint <= 0x1FAFF) || // Symbols & Pictographs Extended-A
+ (codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF)) // Regional Indicator Symbols
+ {
+ return 2;
+ }
+ // Default width for other graphemes
+ return 1;
+ }


The GraphemeLengthInBufferCells() method only checks the first code point of a grapheme cluster (line 81), but grapheme clusters can contain multiple code points (e.g., base character + combining marks, or emoji with skin tone modifiers). This will give incorrect results for complex graphemes.

For example, "👨‍👩‍👧" (family emoji) is a single grapheme but consists of multiple code points joined with zero-width joiners. The current implementation would only check the first code point.

Consider using a more comprehensive approach that accounts for the full grapheme structure, or leveraging existing Unicode width calculation libraries.

Suggested change

// Check if the grapheme is an emoji (basic Unicode ranges)

int codePoint = char.ConvertToUtf32(grapheme, 0);

// Emoji ranges: Emoticons, Misc Symbols, Dingbats, Transport, etc.

if ((codePoint >= 0x1F600 && codePoint <= 0x1F64F) || // Emoticons

(codePoint >= 0x1F300 && codePoint <= 0x1F5FF) || // Misc Symbols & Pictographs

(codePoint >= 0x1F680 && codePoint <= 0x1F6FF) || // Transport & Map

(codePoint >= 0x2600 && codePoint <= 0x26FF) || // Misc symbols

(codePoint >= 0x2700 && codePoint <= 0x27BF) || // Dingbats

(codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols & Pictographs

(codePoint >= 0x1FA70 && codePoint <= 0x1FAFF) || // Symbols & Pictographs Extended-A

(codePoint >= 0x1F1E6 && codePoint <= 0x1F1FF)) // Regional Indicator Symbols

{

return 2;

}

// Default width for other graphemes

return 1;

}

int width = 0;

for (int i = 0; i < grapheme.Length;)

{

int codePoint = char.ConvertToUtf32(grapheme, i);

// Skip zero-width joiner (U+200D) and combining marks

if (codePoint == 0x200D || CharUnicodeInfo.GetUnicodeCategory(grapheme, i) == UnicodeCategory.NonSpacingMark)

{

i += char.IsSurrogatePair(grapheme, i) ? 2 : 1;

continue;

}

width += CharLengthInBufferCells(char.ConvertFromUtf32(codePoint), codePoint);

i += char.IsSurrogatePair(grapheme, i) ? 2 : 1;

}

return width;

}

Copilot · 2025-12-03T04:36:27Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

 using System.Management.Automation.Internal;
 using System.Text;

+


[nitpick] Unnecessary blank line added. This extra whitespace doesn't serve any purpose and should be removed.

Suggested change

Copilot · 2025-12-03T04:36:27Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ (codePoint >= 0x1F0A0 && codePoint <= 0x1F0FF) || // Playing Cards
+ (codePoint >= 0x1F100 && codePoint <= 0x1F64F) || // Enclosed Alphanumeric Supplement, Emoticons
+ (codePoint >= 0x1F680 && codePoint <= 0x1F6FF) || // Transport and Map Symbols
+ (codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols and Pictographs


Overlapping ranges detected: The range 0x1F300-0x1F9FF on line 163 already includes 0x1F900-0x1F9FF (line 168). The duplicate check on line 168 is redundant and should be removed.

Suggested change

(codePoint >= 0x1F900 && codePoint <= 0x1F9FF) || // Supplemental Symbols and Pictographs

Copilot · 2025-12-03T04:36:28Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ (codePoint >= 0x1FA70 && codePoint <= 0x1FAFF) || // Symbols and Pictographs Extended-A
+ (codePoint >= 0x2600 && codePoint <= 0x26FF) || // Miscellaneous Symbols (includes some emojis)
+ (codePoint >= 0x2700 && codePoint <= 0x27BF) || // Dingbats
+ (codePoint >= 0x1F170 && codePoint <= 0x1F251) || // Enclosed Alphanumeric Supplement


Overlapping ranges detected: The range 0x1F100-0x1F64F on line 166 already includes 0x1F170-0x1F251 (line 173). The duplicate check on line 173 is redundant and should be removed.

Suggested change

(codePoint >= 0x1F170 && codePoint <= 0x1F251) || // Enclosed Alphanumeric Supplement

Copilot · 2025-12-03T04:36:28Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ // Use StringInfo to enumerate grapheme clusters
 int length = 0;
- for (; offset < str.Length; offset++)
+ System.Globalization.StringInfo si = new System.Globalization.StringInfo(str);


The variable si is declared but never used. It can be removed since only ParseCombiningCharacters() is needed here.

Suggested change

System.Globalization.StringInfo si = new System.Globalization.StringInfo(str);

Copilot · 2025-12-03T04:36:28Z

src/System.Management.Automation/FormatAndOutput/common/ILineOutput.cs

+ else if (!startFromHead && char.IsLowSurrogate(str[k]) && k - 1 >= kFinal && char.IsHighSurrogate(str[k - 1]))
+ {
+ // This is a surrogate pair when going backward - skip it since we'll process with the high surrogate
+ k = startFromHead ? (k + 1) : (k - 1);


Condition always evaluates to 'false'.

Suggested change

k = startFromHead ? (k + 1) : (k - 1);

k = k - 1;

microsoft-github-policy-service · 2025-12-10T04:48:48Z

This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days.
Maintainer, please provide feedback and/or mark it as Waiting on Author

ZackaryRippee and others added 5 commits December 2, 2025 21:03

multiple attempts at fixing issue: did not go over well

6d74e12

removal of packages + failing of other options

dadeb6a

Merge pull request #1 from DylanD42/zackbranch

a31aaea

merged Zackbranch

Issue fixed

604852d

Merge branch 'master' into masonsBranch

a37c04c

Copilot AI review requested due to automatic review settings December 3, 2025 04:27

menochs1 requested review from a team and jshigetomi as code owners December 3, 2025 04:27

Copilot started reviewing on behalf of menochs1 December 3, 2025 04:27 View session

Copilot finished reviewing on behalf of menochs1 December 3, 2025 04:31

Copilot AI reviewed Dec 3, 2025

View reviewed changes

microsoft-github-policy-service bot added the Review - Needed The PR is being reviewed label Dec 10, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Format table with emojis results in misaligned table #25671 #26567

Format table with emojis results in misaligned table #25671 #26567

menochs1 commented Dec 3, 2025

menochs1 commented Dec 3, 2025

Copilot AI left a comment

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

Copilot AI Dec 3, 2025

microsoft-github-policy-service bot commented Dec 10, 2025

Labels

3 participants

- }
+ }
+ // For BMP characters, use CharLengthInBufferCells to handle CJK/wide logic
+ if (codePoint <= 0xFFFF)
+ return CharLengthInBufferCells((char)codePoint);

	<!-- Removed Wcwidth.Net package reference -->
	<!-- Removed Wcwidth.Net package reference: native emoji width calculation implemented to replace Wcwidth.Net dependency -->

		using System.Management.Automation.Internal;
		using System.Text;

Format table with emojis results in misaligned table #25671 #26567

Are you sure you want to change the base?

Format table with emojis results in misaligned table #25671 #26567

Conversation

menochs1 commented Dec 3, 2025

PR Summary

PR Context

PR Checklist

menochs1 commented Dec 3, 2025

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

microsoft-github-policy-service bot commented Dec 10, 2025

Labels

3 participants