
Running the same tests on our test machine, I am seeing dramatically different performance results. Our test machine has AVX2, but this can't be the whole story.

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time rg '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
  26464

  real 1m8.380s
  user 1m6.211s
  sys 0m2.006s

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time icgrep '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
  26464

  real 0m11.899s
  user 0m9.212s
  sys 0m2.125s

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time rg -i '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
  27370

  real 0m31.891s
  user 0m29.559s
  sys 0m2.119s

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ time icgrep -i '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en | wc -l
  27370

  real 0m14.359s
  user 0m11.765s
  sys 0m2.211s
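To put the wall-clock numbers above side by side, here is a rough sketch that converts the `real` values from `time` into seconds and takes the rg/icgrep ratio for each mode. The helper names (`to_seconds`, `speedup`) are made up for illustration, and each figure comes from a single run, so treat the ratios as indicative only.

```python
import re


def to_seconds(stamp):
    """Convert a bash `time` value such as '1m8.380s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# `real` values copied from the runs above (one run each, not averaged).
real_times = {
    ("rg", "case-sensitive"): "1m8.380s",
    ("icgrep", "case-sensitive"): "0m11.899s",
    ("rg", "-i"): "0m31.891s",
    ("icgrep", "-i"): "0m14.359s",
}


def speedup(mode):
    """Wall-clock time of rg divided by wall-clock time of icgrep."""
    rg = to_seconds(real_times[("rg", mode)])
    icgrep = to_seconds(real_times[("icgrep", mode)])
    return rg / icgrep


for mode in ("case-sensitive", "-i"):
    print(f"{mode}: rg took {speedup(mode):.2f}x as long as icgrep")
```

On this machine rg's case-sensitive run is slower than its own `-i` run, which is the odd part of these results.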

Here is the processor info (only showing the first processor).

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ cat /proc/cpuinfo
  processor       : 0
  vendor_id       : GenuineIntel
  cpu family      : 6
  model           : 61
  model name      : Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz
  stepping        : 4
  microcode       : 0x16
  cpu MHz         : 2076.621
  cache size      : 3072 KB
  physical id     : 0
  siblings        : 4
  core id         : 0
  cpu cores       : 2
  apicid          : 0
  initial apicid  : 0
  fpu             : yes
  fpu_exception   : yes
  cpuid level     : 20
  wp              : yes
  flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap
  bogomips        : 4189.93
  clflush size    : 64
  cache_alignment : 64
  address sizes   : 39 bits physical, 48 bits virtual
  power management:



Strange. I'm not sure how to explain the results either. Is there something I'm supposed to do to enable icgrep to use AVX2? I followed the build instructions in the README verbatim. Here's my cpu info (which is quite new and does have AVX2):

    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 79
    model name      : Intel(R) Core(TM) i7-6900K CPU @ 3.20GHz
    stepping        : 1
    microcode       : 0xb00001d
    cpu MHz         : 1267.578
    cache size      : 20480 KB
    physical id     : 0
    siblings        : 16
    core id         : 0
    cpu cores       : 8
    apicid          : 0
    initial apicid  : 0
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 20
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
    bugs            :
    bogomips        : 6398.91
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 46 bits physical, 48 bits virtual
    power management:


Are you using icgrep1.0? That may explain it.

My reports are from our current development version r5163.

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ perf stat -e instructions:u,cycles:u,branch-misses:u icgrep1.0 -i -c '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en 
  27370
  Performance counter stats for 'icgrep1.0 -i -c \w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade OpenSubtitles2016.raw.en':
   252,725,532,395      instructions:u            #    2.75  insns per cycle        
    91,867,444,975      cycles:u                 
       283,661,301      branch-misses:u                                             
      46.570725331 seconds time elapsed

   
  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ perf stat -e instructions:u,cycles:u,branch-misses:u rg -i -c '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en 
  27370
  Performance counter stats for 'rg -i -c \w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade OpenSubtitles2016.raw.en':
    84,296,004,027      instructions:u            #    1.38  insns per cycle        
    61,298,903,577      cycles:u                 
           510,918      branch-misses:u                                             
      31.962195024 seconds time elapsed

  cameron@cs-osl-10:~/ripgrep/datadir/subtitles$ perf stat -e instructions:u,cycles:u,branch-misses:u icgrep -i -c '\w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade' OpenSubtitles2016.raw.en 
  27370
  Performance counter stats for 'icgrep -i -c \w+ Holmes|\w+ Watson|\w+ Adler|\w+ Moriarty|\w+ Lestrade OpenSubtitles2016.raw.en':
    42,064,581,840      instructions:u            #    1.94  insns per cycle        
    21,723,251,095      cycles:u                 
        47,953,756      branch-misses:u                                             
      13.301160493 seconds time elapsed
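The counters above can be compared a little more directly. The following is a small sketch, not part of any tool, that recomputes instructions per cycle and adds branch misses per thousand instructions from the three `perf stat` runs. The dictionary labels and helper names are chosen here for illustration; the numbers are copied from the output above.

```python
# (instructions:u, cycles:u, branch-misses:u) copied from the perf stat output.
runs = {
    "icgrep1.0": (252_725_532_395, 91_867_444_975, 283_661_301),
    "rg": (84_296_004_027, 61_298_903_577, 510_918),
    "icgrep": (42_064_581_840, 21_723_251_095, 47_953_756),
}


def ipc(instructions, cycles):
    """Instructions retired per cycle (user mode only, as measured)."""
    return instructions / cycles


def misses_per_kilo_instruction(instructions, branch_misses):
    """Branch misses per 1000 instructions."""
    return 1000 * branch_misses / instructions


rg_cycles = runs["rg"][1]
for name, (instructions, cycles, branch_misses) in runs.items():
    print(
        f"{name:10s} ipc={ipc(instructions, cycles):.2f} "
        f"misses/kinsn={misses_per_kilo_instruction(instructions, branch_misses):.3f} "
        f"cycles vs rg={cycles / rg_cycles:.2f}x"
    )
```

Note that a higher IPC does not mean a faster run here: icgrep1.0 has the highest IPC of the three but also executes by far the most instructions, so it still takes the longest.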


    [andrew@Cheetah icgrep-build] pwd
    /home/andrew/clones/icgrep1.0/icgrep-devel/icgrep-build
    [andrew@Cheetah icgrep-build] ./icgrep --version
    LLVM (http://llvm.org/):
      LLVM version 3.5.0svn
      Optimized build.
      Built Sep 24 2016 (11:27:32).
      Default target: x86_64-unknown-linux-gnu
      Host CPU: x86-64
If the file path is any indication, it looks like I'm using `icgrep1.0`. But the file path also has `icgrep-devel` in it. So I don't know. When I get a chance, I guess I'll try to figure out how to compile the devel version. (It seemed like that was what I was doing, by checking out the source, but maybe not.)


Yes, you have icgrep 1.0. The current development version has about the same build process; sorry that it is such a pain. It is available by svn checkout as follows:

  svn co http://parabix.costar.sfu.ca/svn/icGREP/icgrep-devel

AVX2 is autodetected and used if available and enabled by the operating system/hypervisor. (Although icgrep1.0 can be compiled to use AVX2, it had some issues.)
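As a quick sanity check on the "available and enabled" part, one can look for `avx2` in the `flags` line of `/proc/cpuinfo`, as in the dumps earlier in the thread. The sketch below does only that; it is not how icgrep performs its detection, and the function names are invented for this example. It assumes a Linux system and simply reports no AVX2 if the file cannot be read.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx2(cpuinfo_text):
    """True if the kernel-reported flag list includes avx2."""
    return "avx2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        # Not Linux, or /proc is unavailable: nothing to inspect.
        text = ""
    print("avx2 reported by kernel:", has_avx2(text))
```

Both machines in this thread list `avx2` in their flags, so on that evidence the difference in results is more likely down to the icgrep version than to missing hardware support.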



