Attempt to fix differences between x86 FPU and SSE calculations.
author Erik de Castro Lopo <erikd@mega-nerd.com>
Fri, 21 Mar 2014 08:25:55 +0000 (19:25 +1100)
committer Erik de Castro Lopo <erikd@mega-nerd.com>
Fri, 21 Mar 2014 08:26:08 +0000 (19:26 +1100)
The x86 FPU holds intermediate results in wider (80-bit) registers than
the SSE unit uses, resulting in slightly different encodings of audio
data. Attempt to fix this by modifying libFLAC/lpc.c to store each product
in a FLAC__real before adding it to the sum.

At the moment this works, but I could easily imagine a new version of
the compiler optimising the store to the FLAC__real away, leaving us
back in the situation we have now.

Patch-from: Oliver Stöneberg on sourceforge.net
Closes: https://sourceforge.net/p/flac/bugs/409/
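
The concern above, that a compiler may optimise the temporary away, could
in principle be addressed by declaring the temporary volatile. The
following is only a sketch of that idea and is not part of this patch. It
assumes FLAC__real is a typedef for float (redeclared here so the example
stands alone), uses a simplified loop rather than the two-loop form in
lpc.c, and the function name autocorrelation_rounded is hypothetical.

```c
/* Assumption: FLAC__real is float, as in a floating-point build of libFLAC.
 * Redeclared here only to keep the sketch self-contained. */
typedef float FLAC__real;

/* Hypothetical helper, not part of libFLAC: a simplified autocorrelation
 * loop that rounds every product to FLAC__real before accumulating it. */
void autocorrelation_rounded(const FLAC__real data[], unsigned data_len,
                             unsigned lag, FLAC__real autoc[])
{
    unsigned sample, coeff;
    /* volatile forces a real store and reload of each product, so the value
     * is rounded to single precision even if the multiply itself was carried
     * out in a wider x87 register. */
    volatile FLAC__real tmp;

    for (coeff = 0; coeff < lag; coeff++)
        autoc[coeff] = 0.0f;

    for (sample = 0; sample < data_len; sample++) {
        const FLAC__real d = data[sample];
        /* Stop early near the end of the buffer so we never read past it. */
        for (coeff = 0; coeff < lag && sample + coeff < data_len; coeff++) {
            tmp = d * data[sample + coeff];
            autoc[coeff] += tmp;
        }
    }
}
```

The volatile store costs a memory round trip per product, so this trades
speed for predictability. With GCC, building with -ffloat-store, or with
-msse2 -mfpmath=sse on 32-bit x86, are other ways one might try to get
consistent rounding without touching the source.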

src/libFLAC/lpc.c

index 22aab4a..de56f52 100644
@@ -99,7 +99,7 @@ void FLAC__lpc_compute_autocorrelation(const FLAC__real data[], unsigned data_le
         * this version tends to run faster because of better data locality
         * ('data_len' is usually much larger than 'lag')
         */
-       FLAC__real d;
+       FLAC__real d, tmp;
        unsigned sample, coeff;
        const unsigned limit = data_len - lag;
 
@@ -110,13 +110,17 @@ void FLAC__lpc_compute_autocorrelation(const FLAC__real data[], unsigned data_le
                autoc[coeff] = 0.0;
        for(sample = 0; sample <= limit; sample++) {
                d = data[sample];
-               for(coeff = 0; coeff < lag; coeff++)
-                       autoc[coeff] += d * data[sample+coeff];
+               for(coeff = 0; coeff < lag; coeff++) {
+                       tmp = d * data[sample+coeff];
+                       autoc[coeff] += tmp;
+               }
        }
        for(; sample < data_len; sample++) {
                d = data[sample];
-               for(coeff = 0; coeff < data_len - sample; coeff++)
-                       autoc[coeff] += d * data[sample+coeff];
+               for(coeff = 0; coeff < data_len - sample; coeff++) {
+                       tmp = d * data[sample+coeff];
+                       autoc[coeff] += tmp;
+               }
        }
 }