Christophe Gisquet
76c7277385
x86: sbrdsp: implement SSE2 hf_apply_noise
233 to 105 cycles on Arrandale and Win64.
Replacing the multiplication by s_m[m] with a pand and a pxor using
appropriate vectors is slower. Unrolling is a 15-cycle win.
An SSE version was 4 cycles slower.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-04-19 13:19:45 +02:00
2013-03-28 11:20:41 +01:00
2013-03-13 14:18:53 +01:00
2013-04-11 12:32:29 +02:00
2013-04-15 12:17:39 +03:00
2013-04-15 12:17:39 +03:00
2013-03-26 04:08:28 +01:00
2013-03-13 03:59:23 +01:00
2013-04-08 12:38:33 +03:00
2013-04-11 15:56:18 +02:00
2013-04-11 15:56:18 +02:00
2013-04-12 22:22:27 +02:00
2013-04-10 11:04:05 +03:00
2013-04-10 11:04:05 +03:00
2013-04-10 11:03:06 +03:00
2013-04-11 11:53:19 +02:00
2013-04-12 22:36:31 +02:00
2013-03-13 03:59:23 +01:00
2013-03-22 13:00:50 +01:00
2013-03-13 03:59:23 +01:00
2013-04-08 12:38:33 +03:00
2013-04-11 12:32:29 +02:00
2013-03-22 22:57:23 +01:00
2013-03-13 03:59:23 +01:00
2013-03-13 14:18:53 +01:00
2013-04-19 13:19:45 +02:00
2013-04-19 13:19:45 +02:00
2013-03-27 11:32:45 +01:00
2013-03-28 11:20:41 +01:00
2013-03-12 22:54:10 +01:00
2013-04-16 00:44:20 +02:00
2013-03-28 11:20:41 +01:00
2013-04-10 11:04:05 +03:00
2013-03-23 23:37:27 +02:00