java - Code inside thread slower than outside thread..?
I'm trying to alter some code so that it can work with multithreading. I stumbled upon a performance loss when putting a Runnable around some of the code.
For clarification: the original code, let's call it
//dosomething
got a Runnable around it like this:
Runnable r = new Runnable() {
    public void run() {
        //dosomething
    }
};
I then submit the Runnable to a cached thread pool ExecutorService. This is my first step towards multithreading this code: to see whether the code runs as fast with one thread as the original code.
However, this is not the case. Where //dosomething executes in about 2 seconds, the Runnable executes in about 2.5 seconds. I need to mention that some other code, say //dosomethingelse, inside a Runnable had no performance loss compared to the original //dosomethingelse.
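The comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `doSomething` is a hypothetical stand-in for //dosomething (it just sums squares), and the class and method names are invented for the example. As in the question, the timer sits inside `run()`, so thread startup is not measured.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DirectVsPooled {
    // Hypothetical stand-in for //dosomething: sums the squares of an array.
    static double doSomething(float[] data) {
        double sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i] * data[i];
        }
        return sum;
    }

    // Runs the work on the calling thread; returns elapsed nanoseconds.
    static long timeDirect(float[] data, double[] result) {
        long start = System.nanoTime();
        result[0] = doSomething(data);
        return System.nanoTime() - start;
    }

    // Runs the same work as a Runnable on a cached thread pool.
    // The timer is inside run(), so thread startup is excluded.
    static long timePooled(final float[] data, final double[] result) {
        final long[] elapsed = new long[1];
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Runnable r = new Runnable() {
                public void run() {
                    long start = System.nanoTime();
                    result[0] = doSomething(data);
                    elapsed[0] = System.nanoTime() - start;
                }
            };
            Future<?> f = pool.submit(r);
            f.get(); // wait for completion; writes made in run() are visible afterwards
            return elapsed[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        float[] data = new float[100_000];
        Arrays.fill(data, 1.5f);
        double[] direct = new double[1];
        double[] pooled = new double[1];
        System.out.printf("direct: %d ns, pooled: %d ns%n",
                timeDirect(data, direct), timePooled(data, pooled));
    }
}
```

A single run of each variant says little on its own, since the first execution of any code path on the JVM tends to be slower than later ones.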
My guess is that //dosomething has some operations that are not as fast when running in a thread, but I don't know which ones or in what aspect it differs from //dosomethingelse.
Could it be the use of final int[]/float[] arrays that makes a Runnable so much slower? The //dosomethingelse code also used some finals, but //dosomething uses more. This is the only thing I could think of.
Unfortunately, the //dosomething code is quite long and out of context, but I will post it here anyway. For those who know the mean shift segmentation algorithm, this is the part of the code where the mean shift vector is calculated for each pixel. The for-loop
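One way to test the hypothesis about final arrays is sketched below. In an anonymous inner class, captured final locals are stored in synthetic fields of the class instance, so each `sdata[i]` inside `run()` is formally a field read followed by an array access. Whether that costs anything measurable depends on the JIT compiler; copying the captured reference into a local variable at the top of `run()` rules it out either way. The class and method names are hypothetical, and the loop body is a placeholder rather than the mean shift code.

```java
class CapturedArrays {
    // Reads the captured array directly inside the loop.
    static Runnable readingCaptured(final float[] sdata, final double[] out) {
        return new Runnable() {
            public void run() {
                double sum = 0;
                for (int i = 0; i < sdata.length; i++) {
                    sum += sdata[i]; // sdata is read through a synthetic field
                }
                out[0] = sum;
            }
        };
    }

    // Copies the captured reference into a local variable once, then loops.
    static Runnable readingLocalCopy(final float[] sdata, final double[] out) {
        return new Runnable() {
            public void run() {
                float[] local = sdata; // single field read, outside the loop
                double sum = 0;
                for (int i = 0; i < local.length; i++) {
                    sum += local[i];
                }
                out[0] = sum;
            }
        };
    }

    public static void main(String[] args) {
        float[] sdata = new float[100_000];
        java.util.Arrays.fill(sdata, 0.5f);
        double[] a = new double[1];
        double[] b = new double[1];
        readingCaptured(sdata, a).run();
        readingLocalCopy(sdata, b).run();
        System.out.println(a[0] + " " + b[0]);
    }
}
```

If timing both variants over many repetitions shows no difference, the captured finals are unlikely to be the cause.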
for(int i=0; i<l; i++)
runs through each pixel.
timer.start(); // start timer

// initialize the mode table used for the basin of attraction
char[] modetable = new char[l]; // (l is a class property, 100,000)
Arrays.fill(modetable, (char) 0);
int[] pointlist = new int[l];

// allocate memory for yk (current vector)
double[] yk = new double[ln]; // (ln is a final int, defined earlier)

// allocate memory for mh (mean shift vector)
double[] mh = new double[ln];

int idxs2 = 0;
int idxd2 = 0;
for (int i = 0; i < l; i++) {
    // if a mode was already assigned to this data point,
    // skip the point, otherwise proceed to
    // find its mode by applying mean shift...
    if (modetable[i] == 1) {
        continue;
    }

    // initialize point list...
    int pointcount = 0;

    // assign window center (window centers are
    // initialized by createlattice to the point
    // data[i])
    idxs2 = i * ln;
    for (int j = 0; j < ln; j++)
        yk[j] = sdata[idxs2 + j]; // (sdata is an earlier defined final float[] of 100,000 items)

    // calculate the mean shift vector using the lattice
    /*****************************************************/
    // initialize mean shift vector
    for (int j = 0; j < ln; j++) {
        mh[j] = 0;
    }
    double wsuml = 0;
    double weight;

    // find bucket of yk
    int cbucket1 = (int) yk[0] + 1;
    int cbucket2 = (int) yk[1] + 1;
    int cbucket3 = (int) (yk[2] - sminsfinal) + 1;
    int cbucket = cbucket1 + nbuck1 * (cbucket2 + nbuck2 * cbucket3);
    for (int j = 0; j < 27; j++) {
        idxd2 = buckets[cbucket + bucneigh[j]]; // (buckets is a final int[] of 75,000 items)
        // list parse, crt point is cheadlist
        while (idxd2 >= 0) {
            idxs2 = ln * idxd2;
            // determine if inside search window
            double el = sdata[idxs2 + 0] - yk[0];
            double diff = el * el;
            el = sdata[idxs2 + 1] - yk[1];
            diff += el * el;
            //...
            idxd2 = slist[idxd2]; // (slist is a final int[] of 100,000 items)
        }
    }
    //...
}
timer.end(); // stop timer
There is more code, but the last while loop is where I first noticed the difference in performance.
Could anyone think of a reason why this code runs slower inside a Runnable than the original?
Thanks.
Edit: the time is measured inside the code, so it excludes the startup of the thread.
All code runs "inside a thread".
The slowdown you see may be caused by the overhead that multithreading adds. Try parallelizing different parts of your code: the tasks should be neither too large nor too small. For example, you'd probably be better off running each of the outer loops as a separate task, rather than the innermost loops.
There is no single correct way to split the tasks, though; it all depends on how the data looks and what the target machine looks like (2 cores, 8 cores, 512 cores?).
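As a rough illustration of splitting at the outer loop, the sketch below divides the pixel range into one contiguous chunk per task and sums the partial results. It rests on assumptions that are not in the question: `processPixel` is a hypothetical placeholder for the per-pixel work, and the iterations are treated as independent. The real mean shift loop shares a mode table between iterations, so it may need more care than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedLoop {
    // Hypothetical per-pixel work; stands in for the body of the outer loop.
    static double processPixel(float[] sdata, int i) {
        return Math.sqrt(sdata[i]);
    }

    // Splits [0, sdata.length) into `tasks` contiguous chunks (tasks >= 1),
    // runs each chunk as one task, and adds up the partial results.
    static double sumInChunks(final float[] sdata, int tasks) {
        final int n = sdata.length;
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Callable<Double>> jobs = new ArrayList<Callable<Double>>();
            for (int t = 0; t < tasks; t++) {
                final int from = (int) ((long) n * t / tasks);
                final int to = (int) ((long) n * (t + 1) / tasks);
                jobs.add(new Callable<Double>() {
                    public Double call() {
                        double partial = 0; // each task writes only its own locals
                        for (int i = from; i < to; i++) {
                            partial += processPixel(sdata, i);
                        }
                        return partial;
                    }
                });
            }
            double total = 0;
            for (Future<Double> f : pool.invokeAll(jobs)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        float[] sdata = new float[100_000];
        java.util.Arrays.fill(sdata, 4f);
        // One task per available core is a common starting point, not a rule.
        int tasks = Runtime.getRuntime().availableProcessors();
        System.out.println(sumInChunks(sdata, tasks));
    }
}
```

The chunk count is the tuning knob here: it can be varied to see where task overhead starts to outweigh the gain on a given machine.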
Edit: What happens if you run the test repeatedly? E.g., if you do it like this:
Executor executor = ...;
for (int i = 0; i < 10; i++) {
    final int lap = i;
    Runnable r = new Runnable() {
        public void run() {
            long start = System.currentTimeMillis();
            //dosomething
            long duration = System.currentTimeMillis() - start;
            System.out.printf("lap %d: %d ms%n", lap, duration);
        }
    };
    executor.execute(r);
}
Do you notice any difference in the results?
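To compare such laps without the first few skewing the picture, one option is to run them one after another, collect the durations, and look at a median after discarding some warm-up laps. The sketch below is an assumption-laden illustration: `LapTimer`, `timeLaps`, and `medianAfterWarmup` are hypothetical names, and the work in `main` is a placeholder. Waiting for each lap before submitting the next keeps laps from overlapping, which they could do on a pool with several threads.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LapTimer {
    // Runs `work` `laps` times, one after another, on the executor and
    // returns each lap's duration in nanoseconds (measured inside the task).
    static long[] timeLaps(ExecutorService executor, final Runnable work, int laps) {
        long[] durations = new long[laps];
        try {
            for (int i = 0; i < laps; i++) {
                durations[i] = executor.submit(new Callable<Long>() {
                    public Long call() {
                        long start = System.nanoTime();
                        work.run();
                        return System.nanoTime() - start;
                    }
                }).get(); // wait, so laps never run concurrently
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return durations;
    }

    // Median of the laps after dropping the first `warmup` laps
    // (upper middle element when the remaining count is even).
    // Requires warmup < durations.length.
    static long medianAfterWarmup(long[] durations, int warmup) {
        long[] rest = Arrays.copyOfRange(durations, warmup, durations.length);
        Arrays.sort(rest);
        return rest[rest.length / 2];
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        final double[] sink = new double[1]; // keeps the result observable
        Runnable work = new Runnable() {
            public void run() {
                double s = 0;
                for (int i = 0; i < 1_000_000; i++) {
                    s += Math.sqrt(i);
                }
                sink[0] = s;
            }
        };
        long[] laps = timeLaps(executor, work, 10);
        executor.shutdown();
        System.out.println("laps (ns): " + Arrays.toString(laps));
        System.out.println("median after 3 warm-up laps (ns): " + medianAfterWarmup(laps, 3));
    }
}
```

If the later laps of the Runnable version match the original code, the 0.5 second gap was probably a warm-up effect rather than a cost of running inside a Runnable.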