Cockos Incorporated Forums > REAPER Forums > REAPER Bug Reports

Old 08-31-2023, 02:38 PM   #1
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default [Mac / JSFX] The "%" operator gives different results on Mac Intel and Mac M1

This JSFX executes two equivalent simple algorithms and displays the result.
Code:
desc:JJ-Test-JSFX-03

@init
Rm = 256 * 256 * 256 * 256;
Ra = 4 * 8191 + 1;
Rc = 8167;

Re = 441311642;

// Algorithm 1
Rn = 0;
loop (10,
  u = Rn * Ra + Rc;
  Rn = u % Rm;
);

// Algorithm 2
Rp = 0;
loop (10,
  u = Rp * Ra + Rc;
  Rp = u - Rm * floor(u / Rm);
);

sprintf(#dbg_desc, "Expected : %.0f   Got: %.0f / %.0f", Re, Rn, Rp);
The first algorithm uses "%" to compute "A%B", while the second uses "A-B*floor(A/B)" for the same operation. On an Intel Mac, the (correct) result is:
Code:
Expected : 441311642   Got: 441311642 / 441311642
while on a Mac M1 it is:
Code:
Expected : 441311642   Got: 8167 / 441311642
I guess the problem comes from the implementation of "%" on the M1. Is it possible to correct this on the M1 version ?
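
The two algorithms can also be cross-checked outside REAPER. What follows is a minimal Python sketch, not JSFX, and the function names are made up for illustration. It reruns the recurrence once with exact integers and once with the floor-based formula in doubles. It does not try to model what "%" does on the M1.

```python
import math

M = 2 ** 32          # Rm in the JSFX above
A = 4 * 8191 + 1     # Ra
C = 8167             # Rc


def lcg_exact(steps: int) -> int:
    """Reference run using Python's arbitrary-precision integers."""
    n = 0
    for _ in range(steps):
        n = (n * A + C) % M
    return n


def lcg_float(steps: int) -> float:
    """Algorithm 2 from the post: u - Rm * floor(u / Rm), computed in doubles."""
    n = 0.0
    for _ in range(steps):
        u = n * A + C
        n = u - M * math.floor(u / M)
    return n


def max_intermediate(steps: int) -> int:
    """Largest value of u = n * A + C seen during the run."""
    n, biggest = 0, 0
    for _ in range(steps):
        u = n * A + C
        biggest = max(biggest, u)
        n = u % M
    return biggest


print(lcg_exact(10), lcg_float(10))
```

Both runs give 441311642 after 10 steps, and the first step alone gives 8167. The largest intermediate value is above 2^32 but far below 2^53, so doubles represent every step exactly.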

Thanks.

Old 08-31-2023, 11:05 PM   #2
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

This is weird !
Can an example without using loops be found?

Old 09-01-2023, 12:44 AM   #3
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

The problem was found in a pseudo random generator that has been working for years.
Code:
/*
      Random number generator
      Hull–Dobell Theorem :
      Linear congruential generator "n <-- (n * a + c) % m" is correct when:
        m and c are relatively prime,
        a − 1 is divisible by all prime factors of m,
        a − 1 is divisible by 4 if m is divisible by 4.
*/
MaxDouble = 1.7976931348623158  * 10^308;
MinDouble = -1.7976931348623158  * 10^308;
Rndm_m = 256 * 256 * 256 * 256 ; // Modulo: 2^32
Rndm_m_1 = Rndm_m - 1;
Rndm_a = 4 * 8191 + 1;  // 8191 is a prime
Rndm_c = 8167; // a prime, less than 8191
Rndm_n = 59933; // the seed : ANY value will do
// irand(k) provides a random integer in [0 .. k-1]
// irand() provides a random integer in [0 .. 2^32-1]
function irand(k)
local (u)
(
    u = Rndm_n * Rndm_a + Rndm_c;
    Rndm_n = u - Rndm_m * floor(u/Rndm_m);
    // Rndm_n = (Rndm_n * Rndm_a + Rndm_c) % Rndm_m;
    (k <= 0) ? Rndm_n : floor(k * Rndm_n / Rndm_m);
);
I think the error occurs on the second or third call of the function, as soon as "Rndm_n * Rndm_a + Rndm_c" becomes greater than 2^32. However, I have no Mac M1, and I have been working by phone for a long time with a beta tester who owns one, just to spot a single call of "irand" in an 80000-line project. So I can't give you a simpler expression that shows the problem, but it should be easy to do if you have a Mac M1.

For now, in the "irand" function, the first two lines replace the third one, which is commented out, so my problem is temporarily solved. Of course, it would be much better to have the problem corrected in the implementation !
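
The Hull–Dobell conditions quoted in the comment can be checked mechanically for these constants. The following Python sketch is only an illustration; `prime_factors`, `hull_dobell_ok` and `period` are hypothetical helper names, and `period` is meant for small moduli only.

```python
import math


def prime_factors(n: int) -> set:
    """Distinct prime factors of n, by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def hull_dobell_ok(m: int, a: int, c: int) -> bool:
    """Check the three conditions listed in the comment above."""
    if math.gcd(m, c) != 1:                      # m and c relatively prime
        return False
    if any((a - 1) % p for p in prime_factors(m)):
        return False                             # a - 1 divisible by each prime factor of m
    if m % 4 == 0 and (a - 1) % 4 != 0:          # a - 1 divisible by 4 if m is
        return False
    return True


def period(m: int, a: int, c: int, seed: int = 0):
    """Steps until the seed recurs (brute force), or None if it never does within m steps."""
    n = seed
    for steps in range(1, m + 1):
        n = (n * a + c) % m
        if n == seed:
            return steps
    return None


print(hull_dobell_ok(2 ** 32, 4 * 8191 + 1, 8167))
print(period(16, 5, 3))
```

With m = 2^32, a = 4 * 8191 + 1 and c = 8167 all three conditions hold. The small case m = 16, a = 5, c = 3 also satisfies them and visits all 16 values before repeating.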

Old 09-01-2023, 07:19 AM   #4
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by jack461 View Post
I think the error occurs on the second or third call of the function
Even more weird.
Hopefully Justin kicks in....
To help him, a very simple code snippet (no loop, no function calls) would be perfect.

Old 09-01-2023, 11:05 AM   #5
Fabian
Human being with feelings
Join Date: Sep 2008
Location: Sweden
Posts: 7,591
Default

Yeah, this is so weird... that on M1, when printed out Rn has exactly the value of Rc. Rn is assigned this value (8167) on the first iteration. Is it clear that the loop actually executes more than once? Maybe print out the intermediate values?
__________________
// MVHMF
I never always did the right thing, but all I did wasn't wrong...

Old 09-01-2023, 01:18 PM   #6
Justin
Administrator
Join Date: Jan 2005
Location: NYC
Posts: 16,223
Default

ah hmm yeah so % operates on 64-bit integers for x86-64 but 32-bit integers on arm/aarch64/x86/ppc hmm what to do what to do...

at any rate, if your values are going to be larger than 2 billion or so, the x-floor(x/b)*b way is probably the way to go!
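
As a rough illustration of that advice, here is a Python sketch of a range guard plus the floor-based remainder. `fits_int32` and `floor_mod` are hypothetical names, and the limit used is simply the largest signed 32-bit value, taken as a reading of "2 billion or so".

```python
import math

INT32_MAX = 2 ** 31 - 1   # 2147483647


def fits_int32(x: float) -> bool:
    """True if x is small enough for a signed 32-bit integer conversion."""
    return abs(x) <= INT32_MAX


def floor_mod(x: float, b: float) -> float:
    """The x - floor(x/b)*b form, which never converts x to a fixed-width integer."""
    return x - math.floor(x / b) * b


u = 8167.0 * 32766.0 * 32765.0 + 8167.0   # third step of the generator in post #1
print(fits_int32(u), floor_mod(u, 2.0 ** 32))
```

The third step of the generator already exceeds the 32-bit range, which is where a 32-bit "%" and the floor-based form would start to differ.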

Last edited by Justin; 09-01-2023 at 01:58 PM.

Old 09-01-2023, 03:13 PM   #7
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

Quote:
Originally Posted by Fabian View Post
Yeah, this is so weird... that on M1, when printed out Rn has exactly the value of Rc. Rn is assigned this value (8167) on the first iteration. Is it clear that the loop actually executes more than once? Maybe print out the intermediate values?
Before writing the JSFX of my first message, I ran a loop calling my irand(10000) function 100 times. The result is printed as 10 numbers on each line (read lines from bottom to top, it starts with 0, 623, etc.). A picture of the output is attached for both machines. On the M1, because of the implementation of "%", the algorithm loops on "0 623 0". On the Intel machine, the algorithm generates the expected pseudo-random numbers between 0 and 9999.

Now it's bad news that the "%" on M1 is limited to 32 bits, because I have literally thousands of uses of "%" to check in my soft :-(
Attached Images
File Type: png Intel-irand.png (24.5 KB, 141 views)
File Type: png M1-irand.png (25.9 KB, 140 views)

Old 09-01-2023, 03:26 PM   #8
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by Justin View Post
ah hmm yeah so % operates on 64-bit integers for x86-64 but 32-bit integers on arm/aarch64/x86/ppc hmm what to do what to do...
That means the same thing would happen in standard C, or even in ASM ?
This gets weirder every day.....

Old 09-01-2023, 03:32 PM   #9
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by jack461 View Post
Now it's bad news that the "%" on M1 is limited to 32 bits, because I have literally thousands of uses of "%" to check in my soft :-(
The JSFX specs say
"numerator % denominator -- divides two values as integers and returns the remainder."
It does not state the bit depth of those integers.

They also say:

  • value << shift_amt -- converts both values to 32 bit integers, bitwise left shifts the first value by the second. Note that shifts by more than 32 or less than 0 produce undefined results. -- REAPER 4.111+
  • value >> shift_amt -- converts both values to 32 bit integers, bitwise right shifts the first value by the second, with sign-extension (negative values of y produce non-positive results). Note that shifts by more than 32 or less than 0 produce undefined results. -- REAPER 4.111+
  • a & b -- converts both values to integer, and returns bitwise AND of values.
  • a ~ b -- converts both values to 32 bit integers, bitwise XOR the values.

So you might conclude that 32 Bit integer is what is meant by Integer. Which does make sense, as converting a 64-bit integer to a 64-bit float will reduce accuracy. You could specify 48-bit accuracy, but this is a very uncommon format.
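
The accuracy limit mentioned here can be made concrete. A 64-bit float (IEEE 754 double) has a 53-bit significand, so every integer up to 2^53 in magnitude converts exactly, while larger integers may not. A small Python sketch, where `survives_double` is a hypothetical helper name:

```python
def survives_double(n: int) -> bool:
    """True if the integer n converts to a double and back unchanged."""
    return int(float(n)) == n


for n in (2 ** 31 - 1, 2 ** 53 - 1, 2 ** 53, 2 ** 53 + 1, 2 ** 63 - 1):
    print(n, survives_double(n))
```

The first three values survive the round trip; 2^53 + 1 and 2^63 - 1 do not.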

Last edited by mschnell; 09-02-2023 at 11:01 AM.

Old 09-02-2023, 10:49 AM   #10
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

Quote:
Originally Posted by mschnell View Post
So you might conclude that 32 Bit integer is what is meant by Integer.
Rather than reaching such a conclusion from incomplete specifications, I made a lot of tests to understand the JSFX vision of integer arithmetic. The result is presented in a previous post. I concluded that the operators "%", "|", "&" and "~" worked correctly in the range [-2^53, 2^53].

Now, it is interesting (in some bad way) to discover that we can't rely on previous JSFX features...

Old 09-02-2023, 10:58 AM   #11
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

I also did a JSFX that stores multiple characters in a gmem cell and I tested how many bits I can retrieve. it might be possible that that does not work on M1, either.

Last edited by mschnell; 09-02-2023 at 02:45 PM.

Old 09-02-2023, 12:35 PM   #12
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

Quote:
Originally Posted by Justin View Post
ah hmm yeah so % operates on 64-bit integers for x86-64 but 32-bit integers on arm/aarch64/x86/ppc
I don't know the specifications of the M1 processor, but I can't believe there is no modulo on the 64 bit integers ! It seems that you can declare int64 and int128 integers (signed or unsigned) and have all operators like &, |, ^ (xor in C), >>, << etc. operate correctly on them.

Old 09-02-2023, 02:48 PM   #13
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by jack461 View Post
I don't know the specifications of the M1 processor, but I can't believe there is no modulo on the 64 bit integers !
e.g. -> https://developer.arm.com/documentat...signed-Divide-
or -> https://stackoverflow.com/questions/...64-instruction
It seems to be able to do 64 bit division and 128 bit multiplication, but no Modulo in Hardware at all.
That should allow for calculating 64 bit modulo by a sequence of operations.

Last edited by mschnell; 09-03-2023 at 12:51 AM.

Old 09-03-2023, 02:29 AM   #14
Tale
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,745
Default

Quote:
Originally Posted by Justin View Post
ah hmm yeah so % operates on 64-bit integers for x86-64 but 32-bit integers on arm/aarch64/x86/ppc hmm what to do what to do...
Add this to the documentation?

Old 09-03-2023, 02:44 AM   #15
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by Tale View Post
Add this to the documentation?
Hmmm. Documenting Arch dependency seems like a move in the wrong direction for a compiler.

The would-be integer operations in EEL should be documented in a clear way (e.g. minimum bit count for each). (In fact, as 64 bits can't be stored in a 64-bit float, something like 48 might be appropriate.)

And of course the then well-defined behavior is expected on any arch.

As I said, I in fact did a trial-and-error implementation in some of my plugins, but of course it's a major risk to rely on undocumented features.

Old 09-03-2023, 04:25 AM   #16
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

@justin

Quote:
Originally Posted by Justin View Post
ah hmm yeah so % operates on 64-bit integers for x86-64 but 32-bit integers on arm/aarch64/x86/ppc hmm what to do what to do...
Thanks to mschnell, who dug through the web to find the appropriate references, I think "A%B" could be implemented this way (let's say we are interested only in the contiguous set of integers in the [0 2^53-1] range, which is represented exactly by 64-bit floats):

First we need to convert A and B to abs(A) and abs(B). The reason is that the current implementation of "A%B" on Intel always returns a non-negative answer, whatever the signs of A and B are. Since A and B are represented as double precision floats, it is enough to clear the sign bit of their representations.

Next, we convert A and B to 64-bit integers (we are still interested in the limited [0 2^53-1] range). Let's call X1 and X2 the registers in which we put these values, converted using the FCVTZS opcode. FCVTZS is the signed conversion, but since both values are non-negative and below 2^53, the result is the same as for an unsigned one.

Then, as pointed by mschnell, "UDIV X3,X1,X2" can be used to provide "floor(A/B)" as a 64 bit integer register in X3.

Then "MUL X4,X2,X3" provides in X4 the value B*floor(A/B). We multiply two 64-bit ints, but we only care about the low 64 bits of the result, which MUL provides.

Then "SUB X5,X1,X4" computes A-B*floor(A/B), as a 64 bits int value.

Finally, "UCVTF D0,X5" provides "A-B*floor(A/B)", i.e. "A%B", converting X5 to a double precision float (the destination has to be a floating-point register) as used in eel2. We probably don't care about the rounding options of the operation, since we know that the result is already an integer - and then again, we limit our ambition to the [0 2^53-1] range.

Now, this is just a very naive sketch, and I may be totally wrong (actually, I just had a very first look at the arm architecture and opcodes about an hour ago), but I think it is really worthwhile to maintain backward compatibility with the Intel version.
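
The instruction sequence above can be modelled step by step to check the arithmetic. The following Python sketch is only a model under stated assumptions, not the EEL2 implementation: `mod_sketch` is a hypothetical name, registers are emulated with 64-bit masks, inputs are assumed finite and within the [0 2^53-1] range in magnitude, and the zero-divisor branch mirrors the AArch64 rule that UDIV yields 0 for a zero divisor.

```python
MASK64 = (1 << 64) - 1


def mod_sketch(a: float, b: float) -> float:
    """Model of the abs / convert / UDIV / MUL / SUB / UCVTF sequence."""
    x1 = int(abs(a)) & MASK64        # clear sign bit, then FCVTZS (truncate toward zero)
    x2 = int(abs(b)) & MASK64
    x3 = x1 // x2 if x2 else 0       # UDIV: quotient, 0 when the divisor is 0
    x4 = (x2 * x3) & MASK64          # MUL: low 64 bits of the product
    x5 = (x1 - x4) & MASK64          # SUB: remainder as a 64-bit integer
    return float(x5)                 # UCVTF: back to a double


print(mod_sketch(2.0 ** 40, 3.0), mod_sketch(-7.0, 3.0))
```

In this model a negative operand gives the same result as its absolute value, and a zero divisor returns abs(A); what the real "%" returns for a zero divisor is not stated in this thread.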

Best regards !

Old 09-03-2023, 10:56 PM   #17
Tale
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,745
Default

Quote:
Originally Posted by jack461 View Post
Then, as pointed by mschnell, "UDIV X3,X1,X2" can be used to provide "floor(A/B)" as a 64 bit integer register in X3.
If this is supported by all Arm targets, then cool! I think it would be good to fix this then, even though I guess it might break existing code (that might depend on the 32-bit wrap around).

That probably does leave x86 (32-bit), but maybe it's better to keep that "as is".

Old 09-03-2023, 11:22 PM   #18
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by jack461 View Post
[0 2^53-1]
In fact +/- this range

Old 09-03-2023, 11:25 PM   #19
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by Tale View Post
If this is supported by all Arm targets, then cool! I think it would be good to fix this then, even though I guess it might break existing code (that might depend on the 32-bit wrap around).
That probably does leave x86 (32-bit), but maybe it's better to keep that "as is".
I suppose the only decent way is doing that (52 or 53 Bit integer arithmetic) in ASM....
Of course Intel 32 Bit can do this, too, using multiple hardware instructions. C compilers should support 64 Bit integer arithmetic in a proven way.

Last edited by mschnell; 09-03-2023 at 11:31 PM.

Old 09-04-2023, 12:59 AM   #20
Tale
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,745
Default

Quote:
Originally Posted by mschnell View Post
I suppose the only decent way is doing that (52 or 53 Bit integer arithmetic) in ASM....
Of course Intel 32 Bit can do this, too, using multiple hardware instructions. C compilers should support 64 Bit integer arithmetic in a proven way.
Sure, you could fix this for x86 as well, but again, this could break existing code (not just JSFX, but any EEL2, possibly also 3rd party). I wouldn't mind this, but because REAPER/EEL2 x86 probably isn't used that much anymore, it might not be worth the trouble (IMHO, YMMV).

Also note that (I assume that) ReaJS probably won't be patched, so depending on which JSFX targets you want to support, you might still have to deal with this issue.

Old 09-04-2023, 07:31 AM   #21
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by Tale View Post
this could break existing code
Yep that can't be avoided, as for certain operations some code might rely on integer size 32 Bit (assuming such documentation, tested on 32 bit Intel) and other might rely on integer size 53 Bit (tested on 64 Bit intel).

Old 09-04-2023, 02:34 PM   #22
Justin
Administrator
Join Date: Jan 2005
Location: NYC
Posts: 16,223
Default

We could update the arm64 EEL2 to use 64-bit integers for modulus. But I still think it's better to write code that does not depend on it (for x86 and arm32 compat). The documentation doesn't specify the integer size for a reason!

Edit: heh there's another issue with large values and modulus on x86_64:
Code:
x = (2^40) % 3;
vs
Code:
v = 2^40;
x = v % 3;
(the optimizer internally converts to 64 bit for the first case, heh)

Old 09-04-2023, 11:22 PM   #23
Tale
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,745
Default

Quote:
Originally Posted by Justin View Post
But I still think it's better to write code that does not depend on it (for x86 and arm32 compat).
Agreed!

Quote:
Originally Posted by Justin View Post
The documentation doesn't specify the integer size for a reason!
Here I disagree... IMHO you should add this to the documentation, because else people will just fill in the blanks using wishful thinking.

Old 09-04-2023, 11:27 PM   #24
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Quote:
Originally Posted by Tale View Post
people will just fill in the blanks using wishful thinking.
I did this myself

I seem to remember that I did not use modulo but just normal + - * /, and for integer conversion & and | .
But I only tested on x86/64.

Old 09-04-2023, 11:37 PM   #25
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

@justin:

Quote:
Originally Posted by mschnell View Post
Yep that can't be avoided, as for certain operations some code might rely on integer size 32 Bit (assuming such documentation, tested on 32 bit Intel) and other might rely on integer size 53 Bit (tested on 64 Bit intel).
Could you give a good example of an existing JSFX relying on an integer size of 32 bits for a "%" operation, that would break if the "%" operation works up to 53 bit integers?

As we say, "qui peut le plus peut le moins" -- even if it seems a luxury to provide these 53 bit integer operations, I think it is a big plus for the language and its users ! Remember, we have a single data type in eel2, let's use it as best we can, and not add further limitations.

Last edited by jack461; 09-05-2023 at 10:23 AM.

Old 09-05-2023, 03:52 AM   #26
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Might e.g. happen if you calculate the mod divisor by some float calculations and then rely on the float of the mod result to have a limited size.
No idea what some programmer's intuition might come up with ....
But of course such cases are rare, and hence providing all would-be integer operations with 53 bit plus sign might be the least breaking change for a fully documented version.

Last edited by mschnell; 09-05-2023 at 04:13 AM.

Old 09-05-2023, 04:30 AM   #27
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Even exponents should result in modulo = 1
Odd exponents should result in modulo = 2
Code:
// on 64 Bit Intel

x41 = (2^40) % 3;  // = 1: correct
v4 = 2^40;
x42 = v % 3;       // = 2


v4r = v4 / 3;
v4s = v4r | 0;
v4t = v4s * 3;
x4z = v4 - v4t;    // = 1


x51 = (2^51) % 3;  // = 2: correct
v5  = 2^51;
v5r = v5 / 3;
v5s = v5r | 0;
v5t = v5s * 3;
x51z= v5 - v5t;    // = 2


x52 = (2^52) % 3;  // = 2: wrong ????
v5  = 2^52;
v5r = v5 / 3;
v5s = v5r | 0;
v5t = v5s * 3;
x52z= v5 - v5t;    // = 1: correct
weird again.
don't use %

Last edited by mschnell; 09-05-2023 at 06:19 AM.

Old 09-05-2023, 06:09 AM   #28
Justin
Administrator
Join Date: Jan 2005
Location: NYC
Posts: 16,223
Default

Quote:
Originally Posted by Tale View Post
Agreed!


Here I disagree... IMHO you should add this to the documentation, because else people will just fill in the blanks using wishful thinking.
I agree, it should be documented as explicitly undefined, even if we do change it for arm64

Old 09-05-2023, 01:23 PM   #29
Tale
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,745
Default

Quote:
Originally Posted by Justin View Post
I agree, it should be documented as explicitly undefined, even if we do change it for arm64
Well, in that case I stand corrected... We do agree!

Old 09-06-2023, 01:32 AM   #30
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default Wrong

Quote:
Originally Posted by mschnell View Post
Even exponents should result in modulo = 1
Odd exponents should result in modulo = 2
Code:
// on 64 Bit Intel

x41 = (2^40) % 3;  // = 1: correct
v4 = 2^40;
x42 = v % 3;       // = 2


v4r = v4 / 3;
v4s = v4r | 0;
v4t = v4s * 3;
x4z = v4 - v4t;    // = 1


x51 = (2^51) % 3;  // = 2: correct
v5  = 2^51;
v5r = v5 / 3;
v5s = v5r | 0;
v5t = v5s * 3;
x51z= v5 - v5t;    // = 2


x52 = (2^52) % 3;  // = 2: wrong ????
v5  = 2^52;
v5r = v5 / 3;
v5s = v5r | 0;
v5t = v5s * 3;
x52z= v5 - v5t;    // = 1: correct
weird again.
don't use %
Sorry, mschnell, there are bugs in your program.

Here is a more complete and correct test:
Code:
desc:JJ-Test-JSFX-04

in_pin:none
out_pin:none

@init
// on 64 Bit Intel - MacBook Pro - REAPER v6.82

x30  = 2^30;
x30r = x30 / 3;
x30s = x30r | 0;
x30t = x30s * 3;
x30z = x30 - x30t;  // 1
x30u = x30 % 3;     // 1
x30m = (2^30) % 3;  // 1

x31  = 2^31;
x31r = x31 / 3;
x31s = x31r | 0;
x31t = x31s * 3;
x31z = x31 - x31t;  // 2
x31u = x31 % 3;     // 2
x31m = (2^31) % 3;  // 2

x32  = 2^32;
x32r = x32 / 3;
x32s = x32r | 0;
x32t = x32s * 3;
x32z = x32 - x32t;  // 1
x32u = x32 % 3;     // 1
x32m = (2^32) % 3;  // 2

x33  = 2^33;
x33r = x33 / 3;
x33s = x33r | 0;
x33t = x33s * 3;
x33z = x33 - x33t;  // 2
x33u = x33 % 3;     // 2
x33m = (2^33) % 3;  // 2

x40  = 2^40;         
x40r = x40 / 3;     
x40s = x40r | 0;
x40t = x40s * 3;
x40z = x40 - x40t;  // 1
x40u = x40 % 3;     // 1
x40m = (2^40) % 3;  // 2

x41  = 2^41;    
x41r = x41 / 3;
x41s = x41r | 0;
x41t = x41s * 3;
x41z = x41 - x41t;  // 2
x41u = x41 % 3;     // 2
x41m = (2^41) % 3;  // 2

x49  = 2^49;
x49r = x49 / 3;
x49s = x49r | 0;
x49t = x49s * 3;
x49z = x49 - x49t;  // 2
x49u = x49 % 3;     // 2
x49m = (2^49) % 3;  // 2

x50  = 2^50;
x50r = x50 / 3;
x50s = x50r | 0;
x50t = x50s * 3;
x50z = x50 - x50t;  // 1
x50u = x50 % 3;     // 1
x50m = (2^50) % 3;  // 2

x51  = 2^51;
x51r = x51 / 3;
x51s = x51r | 0;
x51t = x51s * 3;
x51z = x51 - x51t;  // 2
x51u = x51 % 3;     // 2
x51m = (2^51) % 3;  // 2

x52  = 2^52;
x52r = x52 / 3;
x52s = x52r | 0;
x52t = x52s * 3;
x52z = x52 - x52t;  // 1
x52u = x52 % 3;     // 1
x52m = (2^52) % 3;  // 2

x53  = 2^53;
x53r = x53 / 3;
x53s = x53r | 0;
x53t = x53s * 3;
x53z = x53 - x53t;  // 2
x53u = x53 % 3;     // 2
x53m = (2^53) % 3;  // 2
What this program shows is:

- for all the tested cases (30, 31, 32, 33, 40, 41, 49, 50, 51, 52, 53), both A%B and A-B*floor(A/B) give the correct result, up to 2^53
- there is indeed, as pointed out by Justin, a bug in the constant folding for (2^n)%3 where n is > 32. The result in this case is always 2.

Conclusion:
- please correct constant folding algorithm
- don't refrain using the % operator !
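
The expected column of the test above can be reproduced independently. The following Python sketch is an illustration only, with hypothetical function names; it computes (2^n) mod 3 with exact integers and with the floor-based formula in doubles for the same exponents.

```python
import math

EXPONENTS = (30, 31, 32, 33, 40, 41, 49, 50, 51, 52, 53)


def exact_mod3(n: int) -> int:
    """(2^n) mod 3 with exact integer arithmetic."""
    return pow(2, n, 3)


def float_mod3(n: int) -> float:
    """x - 3 * floor(x / 3) with x = 2^n held in a double."""
    x = 2.0 ** n
    return x - 3.0 * math.floor(x / 3.0)


for n in EXPONENTS:
    print(n, exact_mod3(n), float_mod3(n))
```

The exact result is 1 for even n and 2 for odd n, and the double-precision formula agrees for every exponent in the list. These are the values in the "z" and "u" comments of the test, while the folded "m" values differ from n = 32 on.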

Old 09-06-2023, 03:14 AM   #31
Tale
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,745
Default

Quote:
Originally Posted by jack461 View Post
- there is indeed, as pointed out by Justin, a bug in the constant folding for (2^n)%3 where n is > 32.
Nitpicking here, but you mean n >= 32, right?

Old 09-06-2023, 03:16 AM   #32
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

Quote:
Originally Posted by Tale View Post
Nitpicking here, but you mean n >= 32, right?
Correct !

Old 09-06-2023, 05:46 AM   #33
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

I did not decently test the dynamic mod (%), because Justin advised against using it with > 32 bit (and in his example with 2^40 it fails.)
The only weird thing I wanted to mention is that the static (optimized out) mod fails with 2^52 even though Justin said it would use int64 under the hood.

Last edited by mschnell; 09-06-2023 at 10:52 PM.

Old 09-07-2023, 08:06 AM   #34
Justin
Administrator
Join Date: Jan 2005
Location: NYC
Posts: 16,223
Default

Quote:
Originally Posted by mschnell View Post
I did not decently test the dynamic mod (%), because Justin advised against using it with > 32 bit (and in his example with 2^40 it fails.)
The only weird thing I wanted to mention is that the static (optimized out) mod fails with 2^52 even though Justin said it would use int64 under the hood.
oops I meant to say it internally converts to 32-bit, heh. fixing that!

Old 09-07-2023, 08:37 AM   #35
jack461
Human being with feelings
Join Date: Nov 2013
Location: France
Posts: 185
Default

Quote:
Originally Posted by Justin View Post
oops I meant to say it internally converts to 32-bit, heh. fixing that!
Excellent ! Thank you, Justin.

Old 09-07-2023, 11:00 AM   #36
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

OK.

But when fixing that, why not have the dynamic modulo algorithm do something like the equivalent of

Code:
q = q | 0;
x = x | 0;
r = x / q;
r = r | 0; 
r = r * q; 
result = x - r;
(or similar (negatives = ???) )

Which should work identically on all archs.

Or convert to int64 and use int64 modulo (not a single instruction on 32 bit archs and on arm64).

Performance issues ?

Last edited by mschnell; 09-07-2023 at 11:06 AM.

Old 09-07-2023, 11:46 AM   #37
Justin
Administrator
Join Date: Jan 2005
Location: NYC
Posts: 16,223
Default

We've updated arm64 in the latest build to also support 64-bit. x86, arm32 (and PPC not that anybody uses it) will still use 32-bit. We could support 64-bit on x86 but it gets big and slow due to divide overflow so meh.


doing a lot of extra work doesn't make sense at this point because the operator is ultimately defined by the implementation, if we were starting from scratch we could do it differently. % in EEL2/jsfx is already different from C's implementation, it takes absolute values before the modulus... if the user really cares they can do their own floor(x - floor(x/b)*b) or whatever
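
The difference between these conventions only shows up with negative operands. The following Python sketch compares three of them on small integers; the function names are hypothetical, and `abs_style_rem` is a model of the "absolute values before the modulus" description above, not EEL2's actual code.

```python
import math


def c_style_rem(x: int, b: int) -> int:
    """C's % on integers: the result takes the sign of the dividend."""
    return int(math.fmod(x, b))


def abs_style_rem(x: int, b: int) -> int:
    """Take absolute values first, then the remainder."""
    return abs(x) % abs(b)


def floored_rem(x: int, b: int) -> int:
    """The x - floor(x/b)*b form: the result takes the sign of the divisor."""
    return x - math.floor(x / b) * b


print(c_style_rem(-7, 3), abs_style_rem(-7, 3), floored_rem(-7, 3))
```

For -7 and 3 the three conventions give -1, 1 and 2 respectively, while for 7 and 3 they all give 1. Code that replaces "%" with the floor-based form therefore needs its own abs() calls if negative inputs can occur.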

Old 09-07-2023, 11:05 PM   #38
mschnell
Human being with feelings
Join Date: Jun 2013
Location: Krefeld, Germany
Posts: 16,916
Default

Yep.
Obviously by any reasoning, a modulo on floats is bound to be a weird thing to begin with

I don't think anybody should expect a "pure" way it might work.

Thanks

Last edited by mschnell; 09-08-2023 at 05:20 AM.