Please Define bit Endianness, byte Endianness and size of base Types #23143
Comments
Just a random example of some mess that MIGHT be prevented if the bit endianness, byte endianness and sizes of base types were defined in the V specification: #23136. Someone could write one version of code in V and it would just work, without needing to reimplement the same math code for every CPU type and without having to take various CPU peculiarities into account. If that universal version is considered "too slow", then someone else, say at a computing center or on a supercomputer maintenance team, can swap it out for their own CPU-specific version, but at least for the rest of the V users the math code would just work.
Suppose someone nostalgic wants to convert V code to Java Virtual Machine code. To start with, a valid V number literal can include underscores, like:
fn main() {
	year := 2_024
	println('Hello, World ${year}')
}
The first approach would be to convert the V code into Java code, something like this:
public class year {
    public static void main(String[] args) {
        int year = 0x07E8;
        System.out.println("Hello, World " + year);
    }
}
And use:
$ javac year.java
$ java year
Hello, World 2024
The second approach would be to convert the V code into something like the following bytecode:
$ javap -c year.class
Compiled from "year.java"
public class year {
public year();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: sipush 2024
3: istore_1
4: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
7: new #3 // class java/lang/StringBuilder
10: dup
11: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
14: ldc #5 // String Hello, World
16: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: iload_1
20: invokevirtual #7 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
23: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
26: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
29: return
}
The third approach would be to go "native" and produce the complete class content byte by byte. It is only at this point that endianness is required. As far as I understand, it is the backend (here the JVM) that needs the endianness, not V itself.
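To make the byte-by-byte approach a bit more concrete, here is a minimal sketch in V of how a backend might emit the `sipush 2024` instruction from the javap listing. JVM class files store multi-byte values in big-endian order, and building the bytes with shifts gives the same result on any host CPU. The function name `emit_sipush` is hypothetical; it is not part of vlib or of any existing V backend.

```v
// emit_sipush is a hypothetical helper, not part of vlib.
// It encodes the JVM `sipush` instruction: opcode 0x11 followed by
// a signed 16-bit operand, high byte first (big-endian).
fn emit_sipush(value i16) []u8 {
	v := u16(value)
	// Shifts and masks work on the numeric value, so the result
	// does not depend on the byte order of the host CPU.
	return [u8(0x11), u8(v >> 8), u8(v & 0xFF)]
}

fn main() {
	code := emit_sipush(2024)
	// 2024 == 0x07E8, so the operand bytes are 0x07 then 0xE8.
	assert code == [u8(0x11), 0x07, 0xE8]
	println(code.hex())
}
```

The byte order here is dictated by the output format (the class file), which matches the point that the backend, not the V source, is what cares about endianness.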
@jorgeluismireles Thank You for the answer.
Suppose someone wants to implement a fast file hashing algorithm and starts using speed hacks like "shift one bit left" to multiply by 2. How can that be done without knowing the byte endianness and bit endianness of a multi-byte number? Thank You.
Just to clarify, V has compile-time endianness detection ($if little_endian), as used here (line 55 in 939d243):
// hton16 converts the 16 bit value `host` to the net format (htons)
pub fn hton16(host u16) u16 {
	$if little_endian {
		return reverse_bytes_u16(host)
	} $else {
		return host
	}
}
...
// reverse_bytes_u16 reverse a u16's byte order
@[inline]
pub fn reverse_bytes_u16(a u16) u16 {
	// vfmt off
	return ((a >> 8) & 0x00FF) |
	       ((a << 8) & 0xFF00)
	// vfmt on
}
And helper functions:
import encoding.binary
fn main() {
	million := u64(1_000_000)
	println('le million: ${binary.little_endian_get_u64(million)}')
	println('bg million: ${binary.big_endian_get_u64(million)}')
}
Write out your code, or pseudocode of your idea, so that we can understand your problem.
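On the earlier question about shift tricks, a small sketch may help. In V, as in C, `<<` operates on the numeric value, so `x << 1` doubles `x` on little-endian and big-endian hosts alike. Byte order only becomes visible once the number is written out as bytes. The example below assumes the `encoding.binary` module from vlib; the variable names are illustrative only.

```v
import encoding.binary

fn main() {
	x := u16(0x0102)
	doubled := x << 1
	// The shift is defined on the value, not on the memory layout.
	assert doubled == 0x0204

	// Byte order is chosen explicitly at the point of serialization.
	mut buf := []u8{len: 2}
	binary.little_endian_put_u16(mut buf, doubled)
	assert buf == [u8(0x04), 0x02]
	binary.big_endian_put_u16(mut buf, doubled)
	assert buf == [u8(0x02), 0x04]

	// Reading back with the matching byte order restores the value.
	assert binary.big_endian_u16(buf) == doubled
}
```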
I think that the compile-time endianness detection will do.
Describe the feature
Oftentimes the abstraction level of software written in C-like programming languages is neither numbers nor characters, but bit-streams. Examples: SHA-256, UTF-8. UTF-8 even has an optional Byte-Order-Mark (BOM). The code of such software is specific to bit endianness, byte endianness and base type size. In the C world there tends to be an assumption that, if the software has been written for "big computers" like laptops, desktops and servers, then the sizes of C base types such as char, int and double, and their bit and byte endianness, match what they happen to be on x86/AMD64 CPUs. From a software stability point of view it would be better if those properties of base types were EXPLICITLY DEFINED in the programming language specification and guaranteed by the programming language implementation.
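As an illustration of the bit-stream style of code described above, here is a minimal sketch of reading a 32-bit word from a byte stream in big-endian order, which is how SHA-256 interprets its input blocks. It uses only shifts and ORs on values, so it does not depend on the host CPU. The helper name `load_be_u32` is hypothetical and not taken from vlib.

```v
// load_be_u32 is a hypothetical helper: it reads the first 4 bytes
// of `b` as one big-endian 32-bit word using only shifts and ORs.
fn load_be_u32(b []u8) u32 {
	assert b.len >= 4
	return (u32(b[0]) << 24) | (u32(b[1]) << 16) | (u32(b[2]) << 8) | u32(b[3])
}

fn main() {
	// 'a', 'b', 'c' followed by the 0x80 padding byte.
	word := load_be_u32([u8(0x61), 0x62, 0x63, 0x80])
	assert word == 0x61626380
}
```

Written this way, the byte order is a property of the data format and is spelled out in the code, rather than inherited from whatever the CPU happens to use.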
Thank You for reading my comment.
Use Case
Use cases:
Proposed Solution
For every base type, define the byte endianness, bit endianness and size in bytes. It's OK for them to match what the classical x86/AMD64 CPU has, but they have to be defined without ambiguity and INDEPENDENT OF CPU type. That is to say, if someone creates a V compiler for some new experimental CPU, then bitstream algorithm implementations in V that work on x86/AMD64 should work WITHOUT ANY MODIFICATION on that new experimental CPU, even if the bit endianness and byte endianness of that CPU differ from those of x86/AMD64.
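For the size part of this proposal, a quick check of V's explicitly sized numeric types can be sketched as follows. The assumption here is that the width in the type name is the actual storage size on every target. Platform-dependent types such as `isize` and `usize` are deliberately left out.

```v
fn main() {
	// Explicitly sized integer types: the name carries the width in bits.
	assert sizeof(u8) == 1
	assert sizeof(u16) == 2
	assert sizeof(u32) == 4
	assert sizeof(u64) == 8
	assert sizeof(i8) == 1
	assert sizeof(i16) == 2
	assert sizeof(i64) == 8
	// Floating point types follow the same naming scheme.
	assert sizeof(f32) == 4
	assert sizeof(f64) == 8
}
```

This says nothing about byte order in memory, which is the remaining part of the request.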
Other Information
Not having defined bit endianness, byte endianness and size of base types introduces an "undefined behaviour" of sorts, where people just assume that the behaviour is like it is on x86/AMD64, but it's not guaranteed to be like that.
Acknowledgements
Version used
git version 2.39.2
Environment details (OS name and version, etc.)
Linux, AMD64 and the various CPUs that the various Raspberry Pi-s come with.