Friday 16 December 2011

What is being cached?

I am trying to time a chunk of code. If I run it from the command line like this:
Code:

taskset -c 0 ./conv
Conv Time:  0.014350082

I get a time of about 14 ms, but with some variation. So I put the command above into a script and ran it repeatedly to get an average:
Code:

taskset -c 0 ./conv
taskset -c 0 ./conv
.
.
.
taskset -c 0 ./conv

The output I get from running the script is:
Code:

Conv Time:  0.014284663
Conv Time:  0.010995277
Conv Time:  0.008977601
Conv Time:  0.008802842
Conv Time:  0.009022121
Conv Time:  0.008894075
Conv Time:  0.008738924
Conv Time:  0.008423263
Conv Time:  0.008694661
Conv Time:  0.008902102

What is causing the speedup I see after the first couple of runs?

If the program starts and exits on every run, what could be cached?
I also ran another version of the script in which I moved the program to a new core on each run.
Code:

taskset -c 0 ./conv
taskset -c 1 ./conv
.
.
.
taskset -c 9 ./conv

That gave an output of:
Code:

Conv Time:  0.014503283
Conv Time:  0.013998719
Conv Time:  0.013912381
Conv Time:  0.014000417
Conv Time:  0.013930824
Conv Time:  0.013995235
Conv Time:  0.014705306
Conv Time:  0.013931455
Conv Time:  0.014027264
Conv Time:  0.014017024

Given that, something is helping the performance if I stay on the same core.
Here is the code:
Code:


// Use:
//        g++ -o conv -O3 conv.cpp -lrt
// to compile

#include <iostream>
#include <cstdio>
#include <sys/time.h>
#include <ctime>        // clock_gettime, struct timespec

using namespace std;


typedef struct {
    float real;
    float imag;
} Complex_T;

#define TAPS  228
#define SAMPS 341
int main(int argc, char **argv)
{

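  // h and x are zero-initialized so the loop below does not read
  // indeterminate values; fill them with real data for a meaningful result.
  Complex_T y[TAPS+SAMPS], h[TAPS] = {}, x[SAMPS] = {};
  int i(0),j(0);

  struct timespec a,b;

  double thetime1,thetime2;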

  clock_gettime(CLOCK_REALTIME,&a);

  for (int k(0); k < 100; k++)
  for ( i = 0; i < TAPS+SAMPS; i++ ) {
    y[i].real = 0;                      // set to zero before sum
    y[i].imag = 0;                      // set to zero before sum
    for ( j = 0; j < TAPS; j++ ){
        if ( i-j < 0 )
          break;
        if ( i-j >= SAMPS )
          continue;
        y[i].real += h[j].real * x[i - j].real;    // convolve: multiply and accumulate
        y[i].imag += h[j].real * x[i - j].imag;    // convolve: multiply and accumulate
        y[i].imag += h[j].imag * x[i - j].real;    // convolve: multiply and accumulate
        y[i].real += -1.0*(h[j].imag * x[i - j].imag);    // convolve: multiply and accumulate
    }
  }

  clock_gettime(CLOCK_REALTIME,&b);

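  // Subtract the fields before converting to double. Converting the full
  // epoch timestamps first limits the difference to roughly 0.2 microseconds
  // of resolution.
  thetime1 = (double)(b.tv_sec - a.tv_sec);            // whole seconds elapsed
  thetime2 = (double)(b.tv_nsec - a.tv_nsec)/1.0e9;    // fractional part, may be negative

  printf("Conv Time:  %.9f\n",(thetime1+thetime2));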

  return 0;
}
