Kernel documentation:
The GPU scheduler provides entities which allow userspace to push jobs into software queues which are then scheduled on a hardware run queue. The software queues have a priority among them. The scheduler selects the entities from the run queue using a FIFO. The scheduler provides dependency handling features among jobs. The driver is supposed to provide callback functions for backend operations to the scheduler like submitting a job to hardware run queue, returning the dependencies of a job etc.
The organisation of the scheduler is the following:
1. Each hw run queue has one scheduler
2. Each scheduler has multiple run queues with different priorities (e.g., HIGH_HW, HIGH_SW, KERNEL, NORMAL)
3. Each scheduler run queue has a queue of entities to schedule
4. Entities themselves maintain a queue of jobs that will be scheduled on the hardware.
The jobs in an entity are always scheduled in the order in which they were pushed.
Principle overview:

As is well known, a modern GPU presents a command-stream interface to the CPU. These command streams control the GPU hardware, issue shader programs, and deliver the state values required by OpenGL or Vulkan.

The GPU scheduler in Linux schedules these GPU command streams. This code was originally part of the AMD GPU driver and was later split out as a shared component.

As the kernel documentation describes, the GPU scheduler provides entities to user programs. User programs submit jobs through these entities; the jobs are first added to a software queue and are then dispatched by the scheduler onto the hardware (GPU).

Each command-stream channel on the GPU corresponds to one GPU scheduler, and one GPU scheduler has multiple run queues representing different priorities.

The GPU scheduler combines two scheduling policies: scheduling by priority first, then FIFO order within the same priority. Job submission to different hardware is implemented through driver-provided callback functions.

Before jobs are committed to hardware, the GPU scheduler performs dependency checking: a job is submitted to the hardware only once all of its dependencies are available.

When a new job needs to be submitted to the GPU, it is first submitted to an entity. A load-balancing algorithm then determines which GPU scheduler the entity will be dispatched to, and the entity is added to the run-queue list of the selected GPU scheduler, where it waits to be scheduled.
Basic usage:
1. Before the GPU scheduler can work, it must be initialized and provided with hardware-specific callback functions. One command-stream channel on the GPU corresponds to one scheduler.
2. A scheduler contains several software run queues corresponding to different priorities; higher-priority run queues are scheduled first.
3. A new job is first submitted to an entity, and the entity is then added to the queue of the scheduler's run queue. Within the same priority, jobs and entities are scheduled in FIFO order.
4. When the scheduler starts scheduling, it first selects the earliest-queued entity from the highest-priority run queue, then picks the earliest-queued job from that entity.
5. Before a job can be submitted to the GPU hardware, its dependencies must be checked, for example whether the framebuffer about to be rendered is available (dependency checking is also implemented through callback functions).
6. After the scheduler selects a job, it submits the job to the GPU's hardware run queue using the callback functions registered at initialization.
7. After the GPU finishes a job, it notifies the GPU scheduler through a dma_fence callback, and the finished fence is signaled.
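The seven steps above can be sketched as the following driver-side flow (a kernel-code sketch, not a runnable example; `my_ops`, `my_entity`, `my_job` and `owner` are hypothetical names, while the `drm_sched_*` calls follow the signatures discussed below):

```c
/* 1. One scheduler per hardware command-stream channel. */
drm_sched_init(&sched, &my_ops, hw_submission, hang_limit,
	       msecs_to_jiffies(timeout_ms), "my_ring");

/* 2./3. One entity per client, attached to one or more schedulers. */
struct drm_gpu_scheduler *list[] = { &sched };
drm_sched_entity_init(&my_entity, DRM_SCHED_PRIORITY_NORMAL,
		      list, ARRAY_SIZE(list), NULL);

/* Per submission: initialize the job, then push it to the entity. */
drm_sched_job_init(&my_job.base, &my_entity, owner);
drm_sched_entity_push_job(&my_job.base, &my_entity);

/*
 * 4.-7. happen inside the scheduler thread: it picks the first entity
 * from the highest-priority run queue, resolves dependencies through
 * my_ops.dependency, submits via my_ops.run_job, and my_ops.free_job
 * runs after the hardware signals the fence returned by run_job.
 */
```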
1. Registering the scheduler
struct drm_gpu_scheduler;
int drm_sched_init(struct drm_gpu_scheduler *sched,          // sched: scheduler instance
                   const struct drm_sched_backend_ops *ops,  // ops: backend operations for this scheduler
                   unsigned hw_submission,  // hw_submission: number of hw submissions that can be in flight
                   unsigned hang_limit,     // hang_limit: number of times to allow a job to hang before dropping it
                   long timeout,            // timeout: timeout value in jiffies for the scheduler
                   const char *name)        // name: name used for debugging
The function drm_sched_init() initializes a struct drm_gpu_scheduler *sched, with the platform-specific callback interface provided through the parameter const struct drm_sched_backend_ops *ops.
Once initialization completes, a kernel thread is started; this kernel thread implements the main scheduling logic.
The kernel thread sleeps until a new job is submitted, then wakes up and schedules it.
hw_submission: specifies the number of commands that a single command channel on the GPU can have in flight simultaneously.
struct drm_sched_backend_ops {
	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
					struct drm_sched_entity *s_entity);
	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
	void (*timedout_job)(struct drm_sched_job *sched_job);
	void (*free_job)(struct drm_sched_job *sched_job);
};
dependency: called when a job is considered as the next scheduling candidate. If the job has a dependency, return a pointer to the dma_fence; the GPU scheduler adds a wake-up action to that dma_fence's callback list, and once the fence is signaled, the GPU scheduler is woken up again. If there are no dependencies, return NULL.
run_job: called once all of a job's dependencies have become available. This interface implements the hardware-specific command submission. After the command has been successfully submitted to the GPU, it returns a dma_fence; the GPU scheduler adds a finished-fence wake-up operation to this dma_fence's callback list. This dma_fence is usually signaled after the GPU has completed the job.
timedout_job: called when a job submitted to the GPU takes too long to execute, in order to trigger the GPU recovery process.
free_job: releases the resources associated with a job after it has been processed.
2. Initializing entities
int drm_sched_entity_init(struct drm_sched_entity *entity,
                          enum drm_sched_priority priority,
                          struct drm_gpu_scheduler **sched_list,
                          unsigned int num_sched_list,
                          atomic_t *guilty)
Initializes a struct drm_sched_entity *entity.
priority: specifies the entity's priority (from low to high, the supported priorities correspond to the run-queue priorities listed earlier).
sched_list: the list of GPU schedulers to which the entity's jobs can be submitted. When the GPU has multiple command-stream channels, a job may have several potential channels it can be submitted to (this is hardware dependent and requires GPU support); sched_list holds these potential channels.
When an entity has multiple GPU schedulers, the DRM scheduler supports a load-balancing algorithm.
num_sched_list: specifies the number of GPU schedulers in sched_list.
The function drm_sched_entity_init() can be called in the driver's open() function, so that when multiple applications each call open(), the driver creates one entity per application.
As mentioned earlier, an entity is the submission point for jobs, and one command-stream channel on the GPU corresponds to one GPU scheduler. When multiple applications submit jobs to the same GPU command-stream channel, the jobs are first added to their respective entities, where they wait for the GPU scheduler's unified scheduling.
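When the hardware exposes several equivalent command channels, an entity can be handed all of them, and the scheduler picks the least-loaded one at submission time. A sketch under that assumption (`NUM_RINGS` and `ring[]` are hypothetical names for illustration):

```c
/* One scheduler was initialized per channel, e.g. ring[0..NUM_RINGS-1].sched. */
struct drm_gpu_scheduler *sched_list[NUM_RINGS];
int i;

for (i = 0; i < NUM_RINGS; i++)
	sched_list[i] = &ring[i].sched;

/*
 * With num_sched_list > 1, pushing a job may place the entity on
 * whichever scheduler in the list currently has the lightest load.
 */
drm_sched_entity_init(&entity, DRM_SCHED_PRIORITY_NORMAL,
		      sched_list, NUM_RINGS, NULL);
```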
3. Initializing a job
int drm_sched_job_init(struct drm_sched_job *job,
                       struct drm_sched_entity *entity,
                       void *owner)
entity: specifies the entity to which the job will be submitted. If the entity's sched_list contains more than one scheduler, the load-balancing algorithm is invoked to choose the best GPU scheduler from the entity's sched_list for scheduling the job.
The function drm_sched_job_init() initializes two dma_fences for the job: scheduled and finished. When the scheduled fence is signaled, the job has been sent to the GPU; when the finished fence is signaled, the GPU has completed the job.
These two fences therefore let the outside world know the job's current state.
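Because drm_sched_job_init() attaches both fences to the job (reachable as job->base.s_fence->scheduled and ->finished, as the vc4 code below also uses), other kernel code can query or wait on the job's state through the ordinary dma_fence API. A sketch, assuming a struct drm_sched_job *job:

```c
#include <linux/dma-fence.h>

/* Has the job been handed to the hardware yet? */
if (dma_fence_is_signaled(&job->s_fence->scheduled))
	pr_info("job is on the GPU\n");

/* Block until the hardware has completed the job (interruptible wait). */
long ret = dma_fence_wait(&job->s_fence->finished, true);
if (ret < 0)
	pr_err("wait interrupted: %ld\n", ret);
```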
4. Submitting a job

void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
                               struct drm_sched_entity *entity)
Once a job has been initialized by drm_sched_job_init(), the function drm_sched_entity_push_job() can be used to submit it to the entity's job_queue.
If this is the first job submitted to the entity's job_queue, the entity is added to the GPU scheduler's run queue and the GPU scheduler's scheduling thread is woken up.
5. The role of dma_fence
DMA fences are the Linux primitives for synchronizing DMA operations between different kernel modules, commonly used for GPU rendering, displaying buffers, and so on.
Using DMA fences reduces waiting in user space by letting data synchronization happen in the kernel. Take the synchronization between GPU rendering and buffer display as an example: GPU rendering writes render data into the framebuffer, and the display module shows the framebuffer's data on screen. The display module must wait for GPU rendering to complete before it can read the data in the framebuffer (and vice versa: GPU rendering must wait until the display is finished with the framebuffer before drawing the next image into it). We could synchronize the two in the application, waiting for rendering to end before calling the display module. But the application tends to sleep while waiting and can do nothing else (such as preparing the next frame's framebuffer for rendering), and calling GPU rendering only after the display finishes also leaves the GPU under-utilized and idle.
With DMA fences, the synchronization between GPU rendering and display is moved into the kernel. After the application calls GPU rendering, it gets back an out fence. Without waiting for rendering to complete, it can immediately call the display module, passing GPU rendering's out fence to the display module as its in fence. Inside the kernel, the display module waits for its in fence to be signaled before displaying, and once GPU rendering completes, the fence is signaled, with no further application involvement.
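In kernel terms the display side does not sleep on the fence either; it registers a callback and programs the flip once the render fence signals. A simplified sketch of that pattern (the `flip` structure and the helpers `schedule_flip_from_callback()`/`program_scanout()` are hypothetical names, not the real KMS implementation; the dma_fence calls are the real API):

```c
#include <linux/dma-fence.h>

static void display_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
{
	/* Render out-fence signaled: framebuffer is ready, program the scan-out. */
	schedule_flip_from_callback(cb);        /* hypothetical helper */
}

/* in_fence is the out-fence that the render job returned to userspace. */
if (dma_fence_add_callback(in_fence, &flip->cb, display_fence_cb))
	/* Non-zero return: the fence was already signaled, flip immediately. */
	program_scanout(flip);                  /* hypothetical helper */
```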
In the GPU scheduler, a job's dependencies must be checked before it is scheduled, and after it is scheduled the job needs to report its current state to the outside world. Both are implemented with fences.
A job's dependencies are called its in fences, and the job's own state reporting uses its out fences; they are essentially the same data structure.
When an in fence exists and is not yet in the signaled state, the job must keep waiting (not by calling dma_fence_wait()) until all of its in fences are signaled. Before waiting, the GPU scheduler registers a callback (dma_fence_cb) on the in fence, containing the code that wakes the scheduler; when the in fence is signaled, this callback is invoked and scheduling of the job resumes.
A job has two out fences: scheduled and finished. When the scheduled fence is signaled, the job has been sent to the GPU; when the finished fence is signaled, the GPU has completed the job.
Test code on VC4:
The vc4 driver does not use the DRM GPU scheduler. Its in-fence handling is blocking, i.e. the application blocks there, which seems to defeat the purpose of fences. The current driver also supports only a single in fence, whereas Vulkan can pass in multiple dependencies.
Actual testing showed no significant performance change compared with the original driver, but using the GPU scheduler solves the fence problems above. The main purpose here is practice; the code mainly follows the v3d driver.
1. First, call drm_sched_init() to create the schedulers. Vc4 has two command entries, bin and render, so two schedulers are created here.
static const struct drm_sched_backend_ops vc4_bin_sched_ops = {
	.dependency = vc4_job_dependency,
	.run_job = vc4_bin_job_run,
	.timedout_job = NULL,
	.free_job = vc4_job_free,
};

static const struct drm_sched_backend_ops vc4_render_sched_ops = {
	.dependency = vc4_job_dependency,
	.run_job = vc4_render_job_run,
	.timedout_job = NULL,
	.free_job = vc4_job_free,
};
int vc4_sched_init(struct vc4_dev *vc4)
{
	int hw_jobs_limit = 1;
	int job_hang_limit = 0;
	int hang_limit_ms = 500;
	int ret;

	ret = drm_sched_init(&vc4->queue[VC4_BIN].sched,
			     &vc4_bin_sched_ops,
			     hw_jobs_limit,
			     job_hang_limit,
			     msecs_to_jiffies(hang_limit_ms),
			     "vc4_bin");
	if (ret) {
		/* Assumes struct vc4_dev embeds its drm_device as "base". */
		dev_err(vc4->base.dev, "Failed to create bin scheduler: %d.", ret);
		return ret;
	}

	ret = drm_sched_init(&vc4->queue[VC4_RENDER].sched,
			     &vc4_render_sched_ops,
			     hw_jobs_limit,
			     job_hang_limit,
			     msecs_to_jiffies(hang_limit_ms),
			     "vc4_render");
	if (ret) {
		dev_err(vc4->base.dev, "Failed to create render scheduler: %d.", ret);
		vc4_sched_fini(vc4);
		return ret;
	}

	return ret;
}
2. Add the entity initialization code to the DRM driver's open callback.
static int vc4_open(struct drm_device *dev, struct drm_file *file)
{
	struct vc4_dev *vc4 = to_vc4_dev(dev);
	struct vc4_file *vc4file;
	struct drm_gpu_scheduler *sched;
	int i;

	vc4file = kzalloc(sizeof(*vc4file), GFP_KERNEL);
	if (!vc4file)
		return -ENOMEM;

	vc4_perfmon_open_file(vc4file);

	for (i = 0; i < VC4_MAX_QUEUES; i++) {
		sched = &vc4->queue[i].sched;
		drm_sched_entity_init(&vc4file->sched_entity[i],
				      DRM_SCHED_PRIORITY_NORMAL,
				      &sched, 1,
				      NULL);
	}

	file->driver_priv = vc4file;

	return 0;
}
3. Once the driver has packaged a job, it can be submitted to the entity.
static void vc4_job_free(struct kref *ref)
{
	struct vc4_job *job = container_of(ref, struct vc4_job, refcount);
	struct vc4_dev *vc4 = job->dev;
	struct vc4_exec_info *exec = job->exec;
	struct vc4_seqno_cb *cb, *cb_temp;
	struct dma_fence *fence;
	unsigned long index;
	unsigned long irqflags;

	xa_for_each(&job->deps, index, fence)
		dma_fence_put(fence);
	xa_destroy(&job->deps);

	dma_fence_put(job->irq_fence);
	dma_fence_put(job->done_fence);

	if (exec)
		vc4_complete_exec(&job->dev->base, exec);

	spin_lock_irqsave(&vc4->job_lock, irqflags);
	list_for_each_entry_safe(cb, cb_temp, &vc4->seqno_cb_list, work.entry) {
		if (cb->seqno <= vc4->finished_seqno) {
			list_del_init(&cb->work.entry);
			schedule_work(&cb->work);
		}
	}
	spin_unlock_irqrestore(&vc4->job_lock, irqflags);

	kfree(job);
}

void vc4_job_put(struct vc4_job *job)
{
	kref_put(&job->refcount, job->free);
}

static int vc4_job_init(struct vc4_dev *vc4, struct drm_file *file_priv,
			struct vc4_job *job, void (*free)(struct kref *ref), u32 in_sync)
{
	struct dma_fence *in_fence = NULL;
	int ret;

	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);

	if (in_sync) {
		ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
		if (ret == -EINVAL)
			goto fail;

		ret = drm_gem_fence_array_add(&job->deps, in_fence);
		if (ret) {
			dma_fence_put(in_fence);
			goto fail;
		}
	}

	kref_init(&job->refcount);
	job->free = free;

	return 0;

fail:
	xa_destroy(&job->deps);
	return ret;
}

static int vc4_push_job(struct drm_file *file_priv, struct vc4_job *job, enum vc4_queue queue)
{
	struct vc4_file *vc4file = file_priv->driver_priv;
	int ret;

	ret = drm_sched_job_init(&job->base, &vc4file->sched_entity[queue], vc4file);
	if (ret)
		return ret;

	job->done_fence = dma_fence_get(&job->base.s_fence->finished);

	kref_get(&job->refcount);

	drm_sched_entity_push_job(&job->base, &vc4file->sched_entity[queue]);

	return 0;
}

/* Queues a struct vc4_exec_info for execution. If no job is
 * currently executing, then submits it.
 *
 * Unlike most GPUs, our hardware only handles one command list at a
 * time. To queue multiple jobs at once, we'd need to edit the
 * previous command list to have a jump to the new one at the end, and
 * then bump the end address. That's a change for a later date,
 * though.
 */
static int
vc4_queue_submit_to_scheduler(struct drm_device *dev,
			      struct drm_file *file_priv,
			      struct vc4_exec_info *exec,
			      struct ww_acquire_ctx *acquire_ctx)
{
	struct vc4_dev *vc4 = to_vc4_dev(dev);
	struct drm_vc4_submit_cl *args = exec->args;
	struct vc4_job *bin = NULL;
	struct vc4_job *render = NULL;
	struct drm_syncobj *out_sync;
	uint64_t seqno;
	unsigned long irqflags;
	int ret;

	spin_lock_irqsave(&vc4->job_lock, irqflags);
	seqno = ++vc4->emit_seqno;
	exec->seqno = seqno;
	spin_unlock_irqrestore(&vc4->job_lock, irqflags);

	render = kcalloc(1, sizeof(*render), GFP_KERNEL);
	if (!render)
		return -ENOMEM;

	render->exec = exec;

	ret = vc4_job_init(vc4, file_priv, render, vc4_job_free, args->in_sync);
	if (ret) {
		kfree(render);
		return ret;
	}

	if (args->bin_cl_size != 0) {
		bin = kcalloc(1, sizeof(*bin), GFP_KERNEL);
		if (!bin) {
			vc4_job_put(render);
			return -ENOMEM;
		}

		bin->exec = exec;

		ret = vc4_job_init(vc4, file_priv, bin, vc4_job_free, args->in_sync);
		if (ret) {
			vc4_job_put(render);
			kfree(bin);
			return ret;
		}
	}

	mutex_lock(&vc4->sched_lock);

	if (bin) {
		ret = vc4_push_job(file_priv, bin, VC4_BIN);
		if (ret) {
			mutex_unlock(&vc4->sched_lock);
			goto fail;
		}

		/* The render job depends on the bin job's done fence. */
		ret = drm_gem_fence_array_add(&render->deps, dma_fence_get(bin->done_fence));
		if (ret) {
			mutex_unlock(&vc4->sched_lock);
			goto fail;
		}
	}

	ret = vc4_push_job(file_priv, render, VC4_RENDER);
	if (ret) {
		mutex_unlock(&vc4->sched_lock);
		goto fail;
	}

	mutex_unlock(&vc4->sched_lock);

	if (args->out_sync) {
		out_sync = drm_syncobj_find(file_priv, args->out_sync);
		if (!out_sync) {
			ret = -EINVAL;
			goto fail;
		}

		/* Export the render job's done fence; bin may be NULL when
		 * there is no binner workload, so it cannot be used here. */
		drm_syncobj_replace_fence(out_sync, render->done_fence);
		exec->fence = render->done_fence;

		drm_syncobj_put(out_sync);
	}

	vc4_update_bo_seqnos(exec, seqno);

	vc4_unlock_bo_reservations(dev, exec, acquire_ctx);

	if (bin)
		vc4_job_put(bin);
	vc4_job_put(render);

	return 0;

fail:
	return ret;
}