Performance - Primitives revisited
I am aware of the Stack Overflow question "What are the primitive Forth operators?", but it doesn't address my question. I am looking not for a minimal set of primitives but for a practical one.
Recently I faced a problem that required sorting quite large arrays, and performance became critical. A naive qsort benchmarked at 20. Porting a heavily (algorithmically) optimized STL version gained me a benchmark of 16. Native C++ laughed at me with a benchmark of 3. Oh well.
Finally I bit the bullet and implemented exch ( a1 a2 -- a1 a2 ) and non-destructive compares ( n1 n2 -- n1 n2 flag ) as primitives. The results are amazing: a three-fold performance gain. Still not C++, but way closer.
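To pin down the semantics of the two stack comments, here is a minimal high-level sketch in standard Forth. It assumes that exch swaps the cells stored at the two addresses and leaves both addresses on the stack, which is what the stack comment suggests but is not spelled out in the post. The name 2dup< is hypothetical and chosen only for illustration. These are ordinary colon definitions, so they will not reproduce the speedup reported for native primitives; they only show the intended behaviour.

```forth
\ Assumed semantics: swap the cells stored at a1 and a2, keep both addresses.
: exch ( a1 a2 -- a1 a2 )
  over @ over @   \ a1 a2 x1 x2   fetch both cells
  3 pick !        \ a1 a2 x1      store x2 at a1
  over ! ;        \ a1 a2         store x1 at a2

\ Hypothetical name: a "less than" that leaves its operands on the stack.
: 2dup< ( n1 n2 -- n1 n2 flag )  2dup < ;

\ Usage: swap two variables, then drop the addresses that exch leaves behind.
variable va  3 va !
variable vb  1 vb !
va vb exch 2drop   \ va now holds 1, vb holds 3
```

Keeping the operands on the stack matters in a sort inner loop, because the same addresses and keys are usually needed again right after the comparison or exchange.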
Why doesn't standard Forth have them out of the box?
PS: the benchmark is (execution time in nsec) / (N log N).
The effect of such changes depends heavily on the quality of your Forth system. The worse the compiler is, the more effect well-thought-out changes will have. On the other hand, it is more difficult to shave 1 cycle off 4 than 10 cycles off 40. This means that at some point high-level rewrites will not pay off anymore (unless you are the compiler writer :-)
There are of course tricks like multi-threading and special CPU instructions that one might experiment with.
To see where you are, it would be helpful if you provided the actual code and timings on a real system.