-
Notifications
You must be signed in to change notification settings - Fork 537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLVM IR] Add support for Unicode strings #9764
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 5 out of 10 changed files in this pull request and generated 1 comment.
Files not reviewed (5)
- src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrModule.cs: Evaluated as low risk
- src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrStringManager.cs: Evaluated as low risk
- src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrVariable.cs: Evaluated as low risk
- src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/MemberInfoUtilities.cs: Evaluated as low risk
- src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrComposer.cs: Evaluated as low risk
Comments suppressed due to low confidence (3)
src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/StringHolder.cs:21
- This line will throw a NullReferenceException if value is null. It should return a new StringHolder with null data instead.
if (value == null) { return new StringHolder ((string)value); }
src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/StringHolder.cs:5
- The behavior introduced in the StringHolder class should be covered by tests to ensure that the comparison and equality operations work as expected.
class StringHolder : IComparable, IComparable<StringHolder>, IEquatable<StringHolder>
src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/NativeAssemblerAttribute.cs:41
- The phrase 'an UTF16' should be corrected to 'a UTF16'.
/// the string is output as an UTF16 in
src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrInstructions.cs
Show resolved
Hide resolved
@@ -299,7 +304,7 @@ public void Add (LlvmIrGlobalVariable variable) | |||
EnsureValidGlobalVariableType (variable); | |||
|
|||
if (IsStringVariable (variable)) { | |||
AddStringGlobalVariable (variable); | |||
AddStringGlobalVariable ((LlvmIrStringVariable)variable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cast is breaking the world, e.g. BuildApplicationAndClean(True, "apk"):
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: System.InvalidCastException: Unable to cast object of type 'Xamarin.Android.Tasks.LLVMIR.LlvmIrGlobalVariable' to type 'Xamarin.Android.Tasks.LLVMIR.LlvmIrStringVariable'.
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.LLVMIR.LlvmIrModule.Add(LlvmIrGlobalVariable variable) in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrModule.cs:line 307
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.LLVMIR.LlvmIrModule.AddGlobalVariable(Type type, String name, Object value, LlvmIrVariableOptions options, String comment) in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrModule.cs:line 274
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.LLVMIR.LlvmIrModule.AddGlobalVariable(String name, Object value, LlvmIrVariableOptions options, String comment) in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrModule.cs:line 262
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.ApplicationConfigNativeAssemblyGenerator.Construct(LlvmIrModule module) in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Utilities/ApplicationConfigNativeAssemblyGenerator.cs:line 209
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.LLVMIR.LlvmIrComposer.Construct() in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Utilities/LlvmIrGenerator/LlvmIrComposer.cs:line 27
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.GeneratePackageManagerJava.AddEnvironment() in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Tasks/GeneratePackageManagerJava.cs:line 344
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Xamarin.Android.Tasks.GeneratePackageManagerJava.RunTask() in /Users/builder/azdo/_work/4/s/xamarin-android/src/Xamarin.Android.Build.Tasks/Tasks/GeneratePackageManagerJava.cs:line 128
…\tools\Xamarin.Android.Common.targets(1741,3): error XAGPM7007: at Microsoft.Android.Build.Tasks.AndroidTask.Execute() in /Users/builder/azdo/_work/4/s/xamarin-android/external/xamarin-android-tools/src/Microsoft.Android.Build.BaseTasks/AndroidTask.cs:line 25
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I have to bring over some more changes from #9572, haven't had time for this today. Will update on Monday.
3cba69d
to
1b44c07
Compare
89e6295
to
1869215
Compare
@@ -203,6 +203,8 @@ public ApplicationConfigNativeAssemblyGenerator (IDictionary<string, string> env | |||
|
|||
protected override void Construct (LlvmIrModule module) | |||
{ | |||
module.DefaultStringGroup = "env"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a "default string group"? Anything to link to in the commit message? (Is this why .apkdesc
files needed updating?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an internal implementation detail, it makes the string manager put strings not assigned to a particular group in this one. It's done this way because all LLVM IR source files are generated separately and we can't be sure that (local) string constant names (e.g. the former-default .str.X
where X is a per-module counter) wouldn't clash between the different generated files. So this is a way to "namespace" strings. If we were able to keep a shared state between all LLVM IR generators, it wouldn't be necessary.
enum LlvmIrStringEncoding | ||
{ | ||
UTF8, | ||
Unicode, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silly Q, but should we use UTF16LE
instead of Unicode
, seeing as "Unicode" doesn't necessarily mean anything…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what the managed land calls the encoding, so I followed suit.
return stringVar; | ||
} | ||
|
||
LlvmIrStringGroup? group; | ||
string groupPrefix; | ||
if (String.IsNullOrEmpty (groupName) || String.Compare ("str", groupName, StringComparison.Ordinal) == 0) { | ||
group = defaultGroup; | ||
groupPrefix = ".str"; | ||
groupPrefix = $".{defaultGroupName}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is where the "default string group" comes in?
What does this do to the resulting LLVM-IR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It allows us to "namespace" strings, as described above - their names mustn't clash between modules. It may result in some change of the final .so size, but it shouldn't (there's a small chance we will have duplicate strings in different .ll files)
|
||
namespace Xamarin.Android.Tasks.LLVMIR; | ||
|
||
class StringHolder : IComparable, IComparable<StringHolder>, IEquatable<StringHolder> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason not to use a record
? You'll get IEquatable<T>
for free!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason, but I don't want to change it anymore :)
return holder; | ||
} | ||
|
||
public int CompareTo (object obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that CompareTo(StringHolder?)
already has a null check, this could be simplified to:
public int CompareTo (object obj) => CompareTo (obj as StringHolder);
return hc ^ Encoding.GetHashCode (); | ||
} | ||
|
||
public override bool Equals (object obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as with CompareTo()
, this can be simplified to:
public override bool Equals (object obj) => Equals (obj as StringHolder);
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
An example of string "namespacing" (both from the same build):
All the above strings use the default string group for the given generator/module. If any of them were symbols with scope other than the given translation unit, their names wouldn't clash. |
Since all the files are generated from separate tasks, we don't have a global LLVM IR state which can keep track of strings and ensure that there are no duplicate symbol names. To prevent potential clashes, each generator now sets the default string group name which is unique for each module. In the future, we should try to manage strings globally (which would also result in more de-duplication)
f090c31
to
5f9205f
Compare
Another portion of code taken from #9572 to make it smaller.
This adds support for Unicode strings to the LLVM IR generator, together with de-duplication and
support for outputting the same string encoded both as UTF-8 and Unicode.