Running the same tests on our test machine, I am seeing dramatically different performance results. Our test machine has AVX2, but this can't be the whole story.
Strange. I'm not sure how to explain the results either. Is there something I'm supposed to do to enable icgrep to use AVX2? I followed the build instructions in the README verbatim. Here's my cpu info (which is quite new and does have AVX2):
If the file path is any indication, it looks like I'm using `icgrep1.0`. But the file path also has `icgrep-devel` in it. So I don't know. When I get a chance, I guess I'll try to figure out how to compile the devel version. (It seemed like that was what I was doing, by checking out the source, but maybe not.)
Yes, you have icgrep 1.0. The current development version has about the same build process; sorry that it is such a pain. It is available by svn checkout as follows.
svn co http://parabix.costar.sfu.ca/svn/icGREP/icgrep-devel
AVX2 is autodetected and used if it is available and enabled by the operating system/hypervisor. (Although icgrep1.0 can be compiled to use AVX2, it had some issues.)
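For anyone who wants to check the "available and enabled" part on their own Linux machine, here is a minimal sketch. The helper name `has_cpu_flag` is hypothetical and not part of icgrep or ripgrep. It only looks at the `flags` line the kernel exposes in `/proc/cpuinfo`; whether icgrep then actually picks its AVX2 path is decided inside icgrep and is not verified here.

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]
# Hypothetical helper: succeeds if FLAG is listed on the first "flags" line
# of FILE (default: /proc/cpuinfo, which exists on Linux only).
has_cpu_flag() {
    awk -v want="$1" '
        /^flags/ {
            # Fields 1 and 2 are "flags" and ":"; the feature names follow.
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
            exit
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/cpuinfo}" 2>/dev/null
}

if has_cpu_flag avx2; then
    echo "avx2: reported by the kernel"
else
    echo "avx2: not reported (or /proc/cpuinfo is unavailable)"
fi
```

Inside a VM the flag can be missing even when the host CPU supports AVX2, since the hypervisor may mask it.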
cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time rg '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
26464

real 1m8.380s
user 1m6.211s
sys 0m2.006s

cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time icgrep '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
26464

real 0m11.899s
user 0m9.212s
sys 0m2.125s

cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time rg -i '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
27370

real 0m31.891s
user 0m29.559s
sys 0m2.119s

cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time icgrep -i '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
27370

real 0m14.359s
user 0m11.765s
sys 0m2.211s
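To put the wall-clock numbers above side by side, the ratios can be worked out with a small sketch like the one below. The `speedup` helper is hypothetical and purely illustrative; the inputs are the `real` times from this run, converted to seconds.

```shell
#!/bin/sh
# speedup SLOW FAST
# Hypothetical helper: prints SLOW / FAST rounded to two decimals.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

speedup 68.380 11.899   # rg vs icgrep, case-sensitive: prints 5.75
speedup 31.891 14.359   # rg -i vs icgrep -i:           prints 2.22
```

These are single runs on one machine, so the ratios are only a rough indication.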
Here is the processor info (only showing the first processor).

cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 61
model name : Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz
stepping : 4
microcode : 0x16
cpu MHz : 2076.621
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap
bogomips : 4189.93
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management: