kernel - How to run multi-queue code using OpenCL?


For example, I'm doing image processing work on every frame of a video.

Processing each frame takes about 200 ms, including writing, processing, and reading. The video runs at 25 fps, so consecutive frames are 40 ms apart. The processing is therefore too slow to show a continuous result.

So here is my idea: use multiple queues to do the work.

On the CPU side:

while (video not over) {
    1. read frame0;
       process frame0 using queue0;
       wait 40 ms;
    2. read frame1;
       process frame1 using queue1;
       wait 40 ms;
    3. 4. 5. ... (after 5 frames, i.e. just the 200 ms processing time)
    6. download frame0's result;
    7. read frame5;
       process frame5 using queue0;
       wait 40 ms;
    ...
}

In other words, each frame is written, processed, and read back on one queue, and consecutive frames go to different queues.
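The timing behind this plan can be written out as a small sketch. It only models the schedule; it does not call OpenCL. The function names `queues_needed` and `round_robin_plan` are made up for illustration, and the sketch assumes the ideal case where every queue really runs in parallel and a frame always takes the full 200 ms regardless of what the other queues are doing.

```python
import math

FRAME_INTERVAL_MS = 40   # 25 fps -> one frame every 40 ms (from the question)
FRAME_LATENCY_MS = 200   # write + process + read per frame (from the question)


def queues_needed(latency_ms, interval_ms):
    """Number of frames that must be in flight at once to keep up with the input."""
    return math.ceil(latency_ms / interval_ms)


def round_robin_plan(num_frames, latency_ms=FRAME_LATENCY_MS,
                     interval_ms=FRAME_INTERVAL_MS):
    """Return (frame, queue, submit_ms, done_ms) tuples for an idealized schedule.

    Assumption: queues do not slow each other down, which a real device
    does not guarantee.
    """
    n = queues_needed(latency_ms, interval_ms)
    plan = []
    for frame in range(num_frames):
        submit_ms = frame * interval_ms          # one frame arrives every interval
        done_ms = submit_ms + latency_ms         # ideal completion time
        plan.append((frame, frame % n, submit_ms, done_ms))
    return plan


if __name__ == "__main__":
    for frame, queue, submit_ms, done_ms in round_robin_plan(7):
        print(f"frame{frame} -> queue{queue}: submit at {submit_ms} ms, "
              f"result at {done_ms} ms")
```

Under these assumptions five queues are needed, and frame0's result becomes available at 200 ms, exactly when frame5 is submitted to queue0 again.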

However, in my experiment the result is only about 2 times faster, not the speedup I imagined.

Can anyone tell me how to deal with this? Thanks!

Assuming you have 1 device, here are my thoughts on this point:

  • The main reason to have multiple command queues (CQs) per single OpenCL device is the ability to execute kernels and I/O operations simultaneously.
  • Usually 1 CQ is enough to load a single device at ~100%. Still, the multi-CQ idea is a reasonable one (in my opinion), since you keep feeding the GPU with workload.
  • Look at the kernel execution time. It may be long enough that the device is always busy executing kernels and simply can't go any faster.
  • I think you don't need to wait 40 ms. A better solution is to process frames in a queue, in the order they are put in, to eliminate the difference between bitstream and display order.
  • If you have too many CQs, the OpenCL driver thread will be busy maintaining them, and performance may decrease.
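To see why overlapping I/O with kernels might give about 2x rather than 5x, here is a rough back-of-the-envelope model. It is not OpenCL code, and the split of the 200 ms into 50 ms write, 100 ms kernel, and 50 ms read is purely hypothetical (the question only gives the 200 ms total). It also assumes that the upload, the kernel, and the download can each run on their own engine at the same time, which depends on the hardware.

```python
def serial_period_ms(write_ms, kernel_ms, read_ms):
    """Time per frame when write, kernel, and read run one after another."""
    return write_ms + kernel_ms + read_ms


def overlapped_period_ms(write_ms, kernel_ms, read_ms):
    """Steady-state time per frame when the three stages fully overlap.

    Assumption: each stage has its own engine, so the slowest stage
    sets the pace of the pipeline.
    """
    return max(write_ms, kernel_ms, read_ms)


def max_fps(period_ms):
    """Frames per second that a given per-frame period can sustain."""
    return 1000.0 / period_ms


if __name__ == "__main__":
    # Hypothetical stage times that add up to the 200 ms from the question.
    write_ms, kernel_ms, read_ms = 50, 100, 50

    serial = serial_period_ms(write_ms, kernel_ms, read_ms)
    overlapped = overlapped_period_ms(write_ms, kernel_ms, read_ms)

    print(f"serial:     {serial} ms/frame, {max_fps(serial):.1f} fps")
    print(f"overlapped: {overlapped} ms/frame, {max_fps(overlapped):.1f} fps")
    print(f"speedup:    {serial / overlapped:.1f}x")
```

With these assumed numbers the kernel alone keeps the device busy for 100 ms per frame, so adding more queues would not get past roughly 10 fps; only a shorter kernel would. Measuring the real stage times (for example with OpenCL event profiling) would show which stage is the actual limit in your case.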
