The anatomy of a linux read call, part 2

Frits Hoogland - Dec 20 '21 - - Dev Community

This blogpost is about the internals of a linux pread (pread64) call, which both Postgresql and YugabyteDB are using. In the previous post I've set the stage of the mandatory steps and thoughts for understanding IO performance.

The essence is that there are several layers that an IO call (pread64()) has to cross from user land before the actual IO is executed on the physical device. The previous post described the execution flow of a read call that got serviced by the linux page cache.

This part takes a look into what happens when physical does actually happen. For that, we use the executable generated by pread.c from the previous post.

Setup ftrace and perform the pread.

In order to have the pread executable execute an actual physical IO, we have to remove it from the page cache. This can be done using the drop_caches file in the proc filesystem at /proc/sys/vm/drop_caches. Obviously this doesn't remove only the pages from the 'tt' file that is used, but also any other cached page that is not currently being used. This means doing this on a busy server is not a good idea.

First lets setup ftrace. I am doing it in a complete way to make the ftrace part be complete in this post.

Turn ftrace off for completeness sake:

echo 0 > tracing_on
Enter fullscreen mode Exit fullscreen mode

Set the current tracer to 'function_graph':

echo function_graph > current_tracer
Enter fullscreen mode Exit fullscreen mode

Enable tracing for the pread64 system call:

echo ksys_pread64 > set_graph_function
Enter fullscreen mode Exit fullscreen mode

Trace annotation: add the program executable name and pid in the trace (funcgraph-proc), and add the name of the function to the end too (funcgraph-tail).

echo funcgraph-tail > trace_options
echo funcgraph-proc > trace_options
Enter fullscreen mode Exit fullscreen mode

Now flush the page cache:

echo 1 > /proc/sys/vm/drop_caches
Enter fullscreen mode Exit fullscreen mode

Turn tracing on:

echo 1 > tracing_on
Enter fullscreen mode Exit fullscreen mode

Now go the other session and execute pread:

./pread
called pread( 3, c, 10). returned that 2 bytes were read.
Those bytes are as follows: x
Enter fullscreen mode Exit fullscreen mode

Turn tracing off:

echo 0 > tracing_on
Enter fullscreen mode Exit fullscreen mode

The result can be seen below at 'Complete physical IO trace', but is really long. Therefore for the sake of readability, I show some snippets.

The start of the system call is identical to the start of the traced pread64 call in the previous blog:

 3)    pr-5848     |               |  ksys_pread64() {
 3)    pr-5848     |               |    __fdget() {
...
 3)    pr-5848     |               |                  pagecache_get_page() {
 3)    pr-5848     |   0.115 us    |                    find_get_entry();
 3)    pr-5848     |   0.463 us    |                  } /* pagecache_get_page */
 3)    pr-5848     |               |                  page_cache_sync_readahead() {
Enter fullscreen mode Exit fullscreen mode

The last line (page_cache_sync_readahead()) is unique, this is the function that is called after pagecache_get_page() couldn't find the page in the page cache. This is where the kernel is starting to organize reading the page from disk. The function generic_file_buffer_read() is where this is happening.

Next we go through:

  • page_cache_sync_readahead()
  • ondemand_readahead()
  • __do_page_cache_readahead

Before we get to the function read_pages(). These functions deal with, as you might have guessed, readahead. The linux kernel can and will dynamically try to perform readahead synchronously, which means by increasing the size of the read request, and asynchronously, which means by scheduling an IO request for logically following blocks using a kernel thread. It does that if the heuristics for it are triggered. Postgresql, and any other application that scans a large files sequentially, benefits from this.

The next function is read_pages(). The read_pages() function uses the concept of the IO device queue 'plug' using the function blk_start_plug(). What this does is pretty much alike a plug in a bathtub, but now for IO requests: it plugs the IO request list, then produces IO requests one by one based on the list of pages that the function has been given, and once it has finished the list, the IO scheduler is run. This batching allows the IO scheduler to assess the IO requests, reorder requests if they are out of order, merge IO requests if the requested block ranges are adjacent, etc.

Then xfs_vm_readpages()>iomap_readpages() is called. This function builds an iomap which later is used to submit IO requests to the device.

Inside iomap_readpages() first iomap_apply() is called. This function creates the iomap that contains the pages that will be requested. The function xfs_read_iomap_begin() is the start of the iomap creation for the requests, which briefly needs to lock the inode to read the xfs blocks map. The function iomap_readpages_actor() is a recursive function that adds pages to the iomap.

The next and last function function in iomap_readpages() is submit_bio(). This function takes the iomap and submits the request to the block device layer. Mind you the device queue is still plugged at this point in time. You can see the IO scheduler in action with functions such as elv_merge().

At this point the IO is scheduled in the scheduler scatter-gather list, but still plugged. At this point no IO request has been sent to the device yet. The execution returns from xfs_vm_readpages(), and the pages that held the iomap are freed.

At this point blk_finish_plug() is executed. The purpose of this function is obvious: to unblock the list of requests for the device and submit these to the physical device. Inside this function, the requests are submitted.

The main actual function here depends on the hardware, in my case the blk_mq_do_dispatch_sched() function mainly is responsible for dispatching requests to the device.

Once all the scheduled requests have been submitted, the blk_finish_plug() function is done.

At this point, the IOs have been submitted, and now the session has to wait for the IOs to be become available. The way this happens for synchronous IO calls is quite generic in many operating systems: after the IO requests have been submitted to the physical device, the process schedules itself off CPU and waits for an interrupt from the physical device indicating the IO requests are done. Therefore, it calls io_schedule() and then schedule(). The process schedules itself off CPU in the state 'D' (uninterruptible sleep), and can only be woken by the interrupt. It even can't be 'killed' (signal 9; kill -9).

It's not common to see processes in state 'D'. The reason for that might not be what you think. The reason is that performing the work for making the requests is done in the state 'R' (running), this is what you can and will see with utilities like 'top'. Only once the work is done the task is given the state 'D', and scheduled off CPU. Most tools show processes/tasks on CPU, which a task in the task state 'D' only is very briefly; it spends most of it's time in state 'D' off CPU, and therefore "invisible" to tools like 'top'.

 3)    pr-5848     |   0.142 us    |                      copy_fpregs_to_fpstate();
 3)    pr-5848     |   0.156 us    |                      finish_task_switch();
 3)    pr-5848     | ! 465.876 us  |                    } /* schedule */
 3)    pr-5848     | ! 466.622 us  |                  } /* io_schedule */
 3)    pr-5848     |   0.053 us    |                  finish_wait();
Enter fullscreen mode Exit fullscreen mode

At a certain point in time the block device will signal the system that is has finished processing the requests, which is done via an interrupt. This is what will wake up a process that has scheduled itself off CPU waiting for its IO requests.

In the above snippet of the execution this is visible with the return of the schedule function.

After returning, the requested page is in the cache, and is accessed using the pagecace_get_page() function, the inode is unlocked (xfs_iunlock()), and the call is returned.

To browse through the full function call flow, see below.

Complete physical IO trace

 3)    pr-5848     |               |  ksys_pread64() {
 3)    pr-5848     |               |    __fdget() {
 3)    pr-5848     |   0.087 us    |      __fget_light();
 3)    pr-5848     |   0.561 us    |    } /* __fdget */
 3)    pr-5848     |               |    vfs_read() {
 3)    pr-5848     |               |      rw_verify_area() {
 3)    pr-5848     |               |        security_file_permission() {
 3)    pr-5848     |               |          selinux_file_permission() {
 3)    pr-5848     |               |            __inode_security_revalidate() {
 3)    pr-5848     |               |              _cond_resched() {
 3)    pr-5848     |   0.040 us    |                rcu_all_qs();
 3)    pr-5848     |   0.352 us    |              } /* _cond_resched */
 3)    pr-5848     |   0.670 us    |            } /* __inode_security_revalidate */
 3)    pr-5848     |   0.040 us    |            avc_policy_seqno();
 3)    pr-5848     |   1.333 us    |          } /* selinux_file_permission */
 3)    pr-5848     |   0.040 us    |          bpf_lsm_file_permission();
 3)    pr-5848     |   0.041 us    |          __fsnotify_parent();
 3)    pr-5848     |   0.046 us    |          fsnotify();
 3)    pr-5848     |   2.710 us    |        } /* security_file_permission */
 3)    pr-5848     |   3.043 us    |      } /* rw_verify_area */
 3)    pr-5848     |               |      __vfs_read() {
 3)    pr-5848     |               |        new_sync_read() {
 3)    pr-5848     |               |          xfs_file_read_iter [xfs]() {
 3)    pr-5848     |               |            xfs_file_buffered_aio_read [xfs]() {
 3)    pr-5848     |               |              xfs_ilock [xfs]() {
 3)    pr-5848     |               |                down_read() {
 3)    pr-5848     |               |                  _cond_resched() {
 3)    pr-5848     |   0.039 us    |                    rcu_all_qs();
 3)    pr-5848     |   0.353 us    |                  } /* _cond_resched */
 3)    pr-5848     |   0.666 us    |                } /* down_read */
 3)    pr-5848     |   1.004 us    |              } /* xfs_ilock [xfs] */
 3)    pr-5848     |               |              generic_file_read_iter() {
 3)    pr-5848     |               |                generic_file_buffered_read() {
 3)    pr-5848     |               |                  _cond_resched() {
 3)    pr-5848     |   0.040 us    |                    rcu_all_qs();
 3)    pr-5848     |   0.352 us    |                  } /* _cond_resched */
 3)    pr-5848     |               |                  pagecache_get_page() {
 3)    pr-5848     |   0.115 us    |                    find_get_entry();
 3)    pr-5848     |   0.463 us    |                  } /* pagecache_get_page */
 3)    pr-5848     |               |                  page_cache_sync_readahead() {
 3)    pr-5848     |   0.040 us    |                    kthread_blkcg();
 3)    pr-5848     |               |                    ondemand_readahead() {
 3)    pr-5848     |               |                      __do_page_cache_readahead() {
 3)    pr-5848     |               |                        __page_cache_alloc() {
 3)    pr-5848     |               |                          alloc_pages_current() {
 3)    pr-5848     |   0.059 us    |                            get_task_policy.part.34();
 3)    pr-5848     |   0.039 us    |                            policy_nodemask();
 3)    pr-5848     |   0.040 us    |                            policy_node();
 3)    pr-5848     |               |                            __alloc_pages_nodemask() {
 3)    pr-5848     |               |                              _cond_resched() {
 3)    pr-5848     |   0.040 us    |                                rcu_all_qs();
 3)    pr-5848     |   0.356 us    |                              } /* _cond_resched */
 3)    pr-5848     |   0.041 us    |                              should_fail_alloc_page();
 3)    pr-5848     |               |                              get_page_from_freelist() {
 3)    pr-5848     |   0.043 us    |                                __inc_numa_state();
 3)    pr-5848     |   0.042 us    |                                __inc_numa_state();
 3)    pr-5848     |   0.055 us    |                                prep_new_page();
 3)    pr-5848     |   1.142 us    |                              } /* get_page_from_freelist */
 3)    pr-5848     |   2.429 us    |                            } /* __alloc_pages_nodemask */
 3)    pr-5848     |   3.741 us    |                          } /* alloc_pages_current */
 3)    pr-5848     |   4.057 us    |                        } /* __page_cache_alloc */
 3)    pr-5848     |               |                        read_pages() {
 3)    pr-5848     |   0.056 us    |                          blk_start_plug();
 3)    pr-5848     |               |                          xfs_vm_readpages [xfs]() {
 3)    pr-5848     |               |                            iomap_readpages() {
 3)    pr-5848     |               |                              iomap_apply() {
 3)    pr-5848     |               |                                xfs_read_iomap_begin [xfs]() {
 3)    pr-5848     |               |                                  xfs_ilock_for_iomap [xfs]() {
 3)    pr-5848     |               |                                    xfs_ilock [xfs]() {
 3)    pr-5848     |               |                                      down_read() {
 3)    pr-5848     |               |                                        _cond_resched() {
 3)    pr-5848     |   0.040 us    |                                          rcu_all_qs();
 3)    pr-5848     |   0.354 us    |                                        } /* _cond_resched */
 3)    pr-5848     |   0.722 us    |                                      } /* down_read */
 3)    pr-5848     |   1.054 us    |                                    } /* xfs_ilock [xfs] */
 3)    pr-5848     |   1.425 us    |                                  } /* xfs_ilock_for_iomap [xfs] */
 3)    pr-5848     |               |                                  xfs_bmapi_read [xfs]() {
 3)    pr-5848     |   0.191 us    |                                    xfs_iext_lookup_extent [xfs]();
 3)    pr-5848     |   0.043 us    |                                    xfs_bmapi_trim_map.isra.16 [xfs]();
 3)    pr-5848     |   0.053 us    |                                    xfs_bmapi_update_map.isra.17 [xfs]();
 3)    pr-5848     |   1.260 us    |                                  } /* xfs_bmapi_read [xfs] */
 3)    pr-5848     |               |                                  xfs_iunlock [xfs]() {
 3)    pr-5848     |   0.040 us    |                                    up_read();
 3)    pr-5848     |   0.382 us    |                                  } /* xfs_iunlock [xfs] */
 3)    pr-5848     |               |                                  xfs_bmbt_to_iomap [xfs]() {
 3)    pr-5848     |   0.040 us    |                                    xfs_fsb_to_db [xfs]();
 3)    pr-5848     |   0.488 us    |                                  } /* xfs_bmbt_to_iomap [xfs] */
 3)    pr-5848     |   4.786 us    |                                } /* xfs_read_iomap_begin [xfs] */
 3)    pr-5848     |               |                                iomap_readpages_actor() {
 3)    pr-5848     |               |                                  add_to_page_cache_lru() {
 3)    pr-5848     |               |                                    __add_to_page_cache_locked() {
 3)    pr-5848     |   0.050 us    |                                      PageHuge();
 3)    pr-5848     |   0.038 us    |                                      shmem_mapping();
 3)    pr-5848     |               |                                      mem_cgroup_charge() {
 3)    pr-5848     |   0.051 us    |                                        get_mem_cgroup_from_mm();
 3)    pr-5848     |   0.061 us    |                                        try_charge();
 3)    pr-5848     |               |                                        mem_cgroup_charge_statistics.isra.74() {
 3)    pr-5848     |   0.040 us    |                                          __count_memcg_events.part.73();
 3)    pr-5848     |   0.354 us    |                                        } /* mem_cgroup_charge_statistics.isra.74 */
 3)    pr-5848     |   0.040 us    |                                        memcg_check_events();
 3)    pr-5848     |   1.673 us    |                                      } /* mem_cgroup_charge */
 3)    pr-5848     |   0.042 us    |                                      _raw_spin_lock_irq();
 3)    pr-5848     |               |                                      __mod_lruvec_page_state() {
 3)    pr-5848     |               |                                        __mod_lruvec_state() {
 3)    pr-5848     |   0.042 us    |                                          __mod_node_page_state();
 3)    pr-5848     |               |                                          __mod_memcg_lruvec_state() {
 3)    pr-5848     |   0.054 us    |                                            __mod_memcg_state.part.71();
 3)    pr-5848     |   0.399 us    |                                          } /* __mod_memcg_lruvec_state */
 3)    pr-5848     |   1.024 us    |                                        } /* __mod_lruvec_state */
 3)    pr-5848     |   1.361 us    |                                      } /* __mod_lruvec_page_state */
 3)    pr-5848     |   5.755 us    |                                    } /* __add_to_page_cache_locked */
 3)    pr-5848     |   0.040 us    |                                    lru_cache_add();
 3)    pr-5848     |   6.446 us    |                                  } /* add_to_page_cache_lru */
 3)    pr-5848     |               |                                  iomap_readpage_actor() {
 3)    pr-5848     |   0.040 us    |                                    iomap_page_create();
 3)    pr-5848     |   0.064 us    |                                    iomap_adjust_read_range();
 3)    pr-5848     |               |                                    bio_alloc_bioset() {
 3)    pr-5848     |               |                                      mempool_alloc() {
 3)    pr-5848     |               |                                        _cond_resched() {
 3)    pr-5848     |   0.039 us    |                                          rcu_all_qs();
 3)    pr-5848     |   0.352 us    |                                        } /* _cond_resched */
 3)    pr-5848     |               |                                        mempool_alloc_slab() {
 3)    pr-5848     |               |                                          kmem_cache_alloc() {
 3)    pr-5848     |   0.040 us    |                                            should_failslab();
 3)    pr-5848     |   0.501 us    |                                          } /* kmem_cache_alloc */
 3)    pr-5848     |   0.826 us    |                                        } /* mempool_alloc_slab */
 3)    pr-5848     |   1.827 us    |                                      } /* mempool_alloc */
 3)    pr-5848     |   0.052 us    |                                      bio_init();
 3)    pr-5848     |   2.578 us    |                                    } /* bio_alloc_bioset */
 3)    pr-5848     |               |                                    bio_associate_blkg() {
 3)    pr-5848     |   0.042 us    |                                      kthread_blkcg();
 3)    pr-5848     |   0.276 us    |                                      bio_associate_blkg_from_css();
 3)    pr-5848     |   0.932 us    |                                    } /* bio_associate_blkg */
 3)    pr-5848     |   0.040 us    |                                    __bio_add_page();
 3)    pr-5848     |   5.267 us    |                                  } /* iomap_readpage_actor */
 3)    pr-5848     | + 12.394 us   |                                } /* iomap_readpages_actor */
 3)    pr-5848     | + 17.914 us   |                              } /* iomap_apply */
 3)    pr-5848     |               |                              submit_bio() {
 3)    pr-5848     |               |                                generic_make_request() {
 3)    pr-5848     |               |                                  generic_make_request_checks() {
 3)    pr-5848     |               |                                    _cond_resched() {
 3)    pr-5848     |   0.039 us    |                                      rcu_all_qs();
 3)    pr-5848     |   0.350 us    |                                    } /* _cond_resched */
 3)    pr-5848     |   0.040 us    |                                    should_fail_bio();
 3)    pr-5848     |   0.185 us    |                                    blk_throtl_bio();
 3)    pr-5848     |   0.042 us    |                                    blk_cgroup_bio_start();
 3)    pr-5848     |   0.133 us    |                                    ktime_get();
 3)    pr-5848     |   2.376 us    |                                  } /* generic_make_request_checks */
 3)    pr-5848     |   0.090 us    |                                  blk_queue_enter();
 3)    pr-5848     |               |                                  dm_make_request [dm_mod]() {
 3)    pr-5848     |   0.100 us    |                                    __srcu_read_lock();
 3)    pr-5848     |               |                                    bio_alloc_bioset() {
 3)    pr-5848     |               |                                      mempool_alloc() {
 3)    pr-5848     |               |                                        _cond_resched() {
 3)    pr-5848     |   0.039 us    |                                          rcu_all_qs();
 3)    pr-5848     |   0.352 us    |                                        } /* _cond_resched */
 3)    pr-5848     |               |                                        mempool_alloc_slab() {
 3)    pr-5848     |               |                                          kmem_cache_alloc() {
 3)    pr-5848     |   0.042 us    |                                            should_failslab();
 3)    pr-5848     |   0.485 us    |                                          } /* kmem_cache_alloc */
 3)    pr-5848     |   0.807 us    |                                        } /* mempool_alloc_slab */
 3)    pr-5848     |   1.776 us    |                                      } /* mempool_alloc */
 3)    pr-5848     |   0.052 us    |                                      bio_init();
 3)    pr-5848     |   2.458 us    |                                    } /* bio_alloc_bioset */
 3)    pr-5848     |               |                                    disk_start_io_acct() {
 3)    pr-5848     |               |                                      __part_start_io_acct() {
 3)    pr-5848     |   0.048 us    |                                        update_io_ticks();
 3)    pr-5848     |   0.410 us    |                                      } /* __part_start_io_acct */
 3)    pr-5848     |   0.734 us    |                                    } /* disk_start_io_acct */
 3)    pr-5848     |               |                                    __split_and_process_non_flush [dm_mod]() {
 3)    pr-5848     |   0.190 us    |                                      dm_table_find_target [dm_mod]();
 3)    pr-5848     |   0.050 us    |                                      max_io_len [dm_mod]();
 3)    pr-5848     |   0.043 us    |                                      alloc_tio.isra.38 [dm_mod]();
 3)    pr-5848     |               |                                      __bio_clone_fast() {
 3)    pr-5848     |   0.052 us    |                                        bio_clone_blkg_association();
 3)    pr-5848     |   0.066 us    |                                        ktime_get();
 3)    pr-5848     |   1.157 us    |                                      } /* __bio_clone_fast */
 3)    pr-5848     |   0.057 us    |                                      bio_advance();
 3)    pr-5848     |               |                                      __map_bio [dm_mod]() {
 3)    pr-5848     |               |                                        linear_map [dm_mod]() {
 3)    pr-5848     |               |                                          bio_associate_blkg() {
 3)    pr-5848     |   0.295 us    |                                            bio_associate_blkg_from_css();
 3)    pr-5848     |   0.612 us    |                                          } /* bio_associate_blkg */
 3)    pr-5848     |   1.074 us    |                                        } /* linear_map [dm_mod] */
 3)    pr-5848     |               |                                        generic_make_request() {
 3)    pr-5848     |               |                                          generic_make_request_checks() {
 3)    pr-5848     |               |                                            _cond_resched() {
 3)    pr-5848     |   0.039 us    |                                              rcu_all_qs();
 3)    pr-5848     |   0.362 us    |                                            } /* _cond_resched */
 3)    pr-5848     |   0.040 us    |                                            should_fail_bio();
 3)    pr-5848     |   0.098 us    |                                            __disk_get_part();
 3)    pr-5848     |   0.147 us    |                                            blk_throtl_bio();
 3)    pr-5848     |   0.061 us    |                                            blk_cgroup_bio_start();
 3)    pr-5848     |   0.062 us    |                                            ktime_get();
 3)    pr-5848     |   2.593 us    |                                          } /* generic_make_request_checks */
 3)    pr-5848     |   2.952 us    |                                        } /* generic_make_request */
 3)    pr-5848     |   4.743 us    |                                      } /* __map_bio [dm_mod] */
 3)    pr-5848     |   8.124 us    |                                    } /* __split_and_process_non_flush [dm_mod] */
 3)    pr-5848     |   0.040 us    |                                    errno_to_blk_status();
 3)    pr-5848     |   0.063 us    |                                    dec_pending [dm_mod]();
 3)    pr-5848     |   0.041 us    |                                    __srcu_read_unlock();
 3)    pr-5848     | + 13.730 us   |                                  } /* dm_make_request [dm_mod] */
 3)    pr-5848     |   0.046 us    |                                  blk_queue_enter();
 3)    pr-5848     |               |                                  blk_mq_make_request() {
 3)    pr-5848     |   0.050 us    |                                    blk_queue_bounce();
 3)    pr-5848     |   0.125 us    |                                    blk_queue_split();
 3)    pr-5848     |   0.066 us    |                                    bio_integrity_prep();
 3)    pr-5848     |   0.055 us    |                                    blk_attempt_plug_merge();
 3)    pr-5848     |               |                                    __blk_mq_sched_bio_merge() {
 3)    pr-5848     |               |                                      dd_bio_merge() {
 3)    pr-5848     |   0.099 us    |                                        _raw_spin_lock();
 3)    pr-5848     |               |                                        blk_mq_sched_try_merge() {
 3)    pr-5848     |               |                                          elv_merge() {
 3)    pr-5848     |   0.039 us    |                                            elv_rqhash_find();
 3)    pr-5848     |               |                                            dd_request_merge() {
 3)    pr-5848     |   0.040 us    |                                              elv_rb_find();
 3)    pr-5848     |   0.386 us    |                                            } /* dd_request_merge */
 3)    pr-5848     |   1.092 us    |                                          } /* elv_merge */
 3)    pr-5848     |   1.448 us    |                                        } /* blk_mq_sched_try_merge */
 3)    pr-5848     |   2.223 us    |                                      } /* dd_bio_merge */
 3)    pr-5848     |   2.672 us    |                                    } /* __blk_mq_sched_bio_merge */
 3)    pr-5848     |               |                                    __rq_qos_throttle() {
 3)    pr-5848     |               |                                      wbt_wait() {
 3)    pr-5848     |   0.052 us    |                                        bio_to_wbt_flags.isra.20.part.21();
 3)    pr-5848     |   0.400 us    |                                      } /* wbt_wait */
 3)    pr-5848     |   0.040 us    |                                      blkcg_iolatency_throttle();
 3)    pr-5848     |   1.289 us    |                                    } /* __rq_qos_throttle */
 3)    pr-5848     |               |                                    __blk_mq_alloc_request() {
 3)    pr-5848     |               |                                      blk_mq_get_tag() {
 3)    pr-5848     |   0.211 us    |                                        __blk_mq_get_tag();
 3)    pr-5848     |   0.560 us    |                                      } /* blk_mq_get_tag */
 3)    pr-5848     |               |                                      blk_mq_rq_ctx_init.isra.46() {
 3)    pr-5848     |   0.066 us    |                                        ktime_get();
 3)    pr-5848     |   0.039 us    |                                        dd_prepare_request();
 3)    pr-5848     |   0.881 us    |                                      } /* blk_mq_rq_ctx_init.isra.46 */
 3)    pr-5848     |   2.098 us    |                                    } /* __blk_mq_alloc_request */
 3)    pr-5848     |               |                                    __rq_qos_track() {
 3)    pr-5848     |               |                                      wbt_track() {
 3)    pr-5848     |   0.039 us    |                                        bio_to_wbt_flags.isra.20.part.21();
 3)    pr-5848     |   0.368 us    |                                      } /* wbt_track */
 3)    pr-5848     |   0.737 us    |                                    } /* __rq_qos_track */
 3)    pr-5848     |   0.063 us    |                                    blk_rq_bio_prep();
 3)    pr-5848     |               |                                    blk_account_io_start() {
 3)    pr-5848     |   0.055 us    |                                      disk_map_sector_rcu();
 3)    pr-5848     |   0.072 us    |                                      update_io_ticks();
 3)    pr-5848     |   0.754 us    |                                    } /* blk_account_io_start */
 3)    pr-5848     |   0.075 us    |                                    blk_add_rq_to_plug();
 3)    pr-5848     | + 11.588 us   |                                  } /* blk_mq_make_request */
 3)    pr-5848     | + 29.610 us   |                                } /* generic_make_request */
 3)    pr-5848     | + 30.019 us   |                              } /* submit_bio */
 3)    pr-5848     | + 48.684 us   |                            } /* iomap_readpages */
 3)    pr-5848     | + 49.049 us   |                          } /* xfs_vm_readpages [xfs] */
 3)    pr-5848     |   0.044 us    |                          put_pages_list();
 3)    pr-5848     |               |                          blk_finish_plug() {
 3)    pr-5848     |               |                            blk_flush_plug_list() {
 3)    pr-5848     |               |                              blk_mq_flush_plug_list() {
 3)    pr-5848     |               |                                blk_mq_sched_insert_requests() {
 3)    pr-5848     |               |                                  dd_insert_requests() {
 3)    pr-5848     |   0.042 us    |                                    _raw_spin_lock();
 3)    pr-5848     |               |                                    blk_mq_sched_try_insert_merge() {
 3)    pr-5848     |               |                                      elv_attempt_insert_merge() {
 3)    pr-5848     |   0.039 us    |                                        elv_rqhash_find();
 3)    pr-5848     |   0.371 us    |                                      } /* elv_attempt_insert_merge */
 3)    pr-5848     |   0.699 us    |                                    } /* blk_mq_sched_try_insert_merge */
 3)    pr-5848     |   0.043 us    |                                    blk_mq_sched_request_inserted();
 3)    pr-5848     |   0.075 us    |                                    elv_rb_add();
 3)    pr-5848     |   0.043 us    |                                    elv_rqhash_add();
 3)    pr-5848     |   2.470 us    |                                  } /* dd_insert_requests */
 3)    pr-5848     |               |                                  blk_mq_run_hw_queue() {
 3)    pr-5848     |   0.047 us    |                                    dd_has_work();
 3)    pr-5848     |               |                                    __blk_mq_delay_run_hw_queue() {
 3)    pr-5848     |               |                                      __blk_mq_run_hw_queue() {
 3)    pr-5848     |               |                                        blk_mq_sched_dispatch_requests() {
 3)    pr-5848     |               |                                          __blk_mq_sched_dispatch_requests() {
 3)    pr-5848     |               |                                            __blk_mq_do_dispatch_sched() {
 3)    pr-5848     |   0.040 us    |                                              dd_has_work();
 3)    pr-5848     |   0.214 us    |                                              scsi_mq_get_budget();
 3)    pr-5848     |               |                                              dd_dispatch_request() {
 3)    pr-5848     |   0.040 us    |                                                _raw_spin_lock();
 3)    pr-5848     |               |                                                deadline_remove_request() {
 3)    pr-5848     |   0.059 us    |                                                  elv_rb_del();
 3)    pr-5848     |   0.050 us    |                                                  elv_rqhash_del();
 3)    pr-5848     |   0.729 us    |                                                } /* deadline_remove_request */
 3)    pr-5848     |   1.462 us    |                                              } /* dd_dispatch_request */
 3)    pr-5848     |   0.039 us    |                                              scsi_mq_set_rq_budget_token();
 3)    pr-5848     |   0.227 us    |                                              blk_mq_get_driver_tag();
 3)    pr-5848     |   0.055 us    |                                              dd_has_work();
 3)    pr-5848     |               |                                              blk_mq_dispatch_rq_list() {
 3)    pr-5848     |   0.050 us    |                                                blk_mq_get_driver_tag();
 3)    pr-5848     |               |                                                scsi_queue_rq() {
 3)    pr-5848     |               |                                                  scsi_init_command() {
 3)    pr-5848     |   0.041 us    |                                                    scsi_req_init();
 3)    pr-5848     |   0.038 us    |                                                    init_timer_key();
 3)    pr-5848     |   0.041 us    |                                                    scsi_add_cmd_to_list();
 3)    pr-5848     |   1.238 us    |                                                  } /* scsi_init_command */
 3)    pr-5848     |               |                                                  blk_mq_start_request() {
 3)    pr-5848     |   0.068 us    |                                                    ktime_get();
 3)    pr-5848     |               |                                                    __rq_qos_issue() {
 3)    pr-5848     |   0.039 us    |                                                      wbt_issue();
 3)    pr-5848     |   0.410 us    |                                                    } /* __rq_qos_issue */
 3)    pr-5848     |   0.040 us    |                                                    blk_add_timer();
 3)    pr-5848     |   1.498 us    |                                                  } /* blk_mq_start_request */
 3)    pr-5848     |               |                                                  sd_init_command [sd_mod]() {
 3)    pr-5848     |               |                                                    scsi_alloc_sgtables() {
 3)    pr-5848     |   0.090 us    |                                                      __blk_rq_map_sg();
 3)    pr-5848     |   0.534 us    |                                                    } /* scsi_alloc_sgtables */
 3)    pr-5848     |   1.068 us    |                                                  } /* sd_init_command [sd_mod] */
 3)    pr-5848     |   0.039 us    |                                                  scsi_log_send();
 3)    pr-5848     |               |                                                  ata_scsi_queuecmd [libata]() {
 3)    pr-5848     |   0.117 us    |                                                    _raw_spin_lock_irqsave();
 3)    pr-5848     |               |                                                    ata_scsi_find_dev [libata]() {
 3)    pr-5848     |               |                                                      __ata_scsi_find_dev [libata]() {
 3)    pr-5848     |   0.057 us    |                                                        ata_find_dev [libata]();
 3)    pr-5848     |   0.384 us    |                                                      } /* __ata_scsi_find_dev [libata] */
 3)    pr-5848     |   0.696 us    |                                                    } /* ata_scsi_find_dev [libata] */
 3)    pr-5848     |               |                                                    ata_scsi_translate [libata]() {
 3)    pr-5848     |   0.107 us    |                                                      ata_qc_new_init [libata]();
 3)    pr-5848     |   0.040 us    |                                                      ata_sg_init [libata]();
 3)    pr-5848     |               |                                                      ata_scsi_rw_xlat [libata]() {
 3)    pr-5848     |   0.093 us    |                                                        ata_build_rw_tf [libata]();
 3)    pr-5848     |   0.482 us    |                                                      } /* ata_scsi_rw_xlat [libata] */
 3)    pr-5848     |               |                                                      ahci_pmp_qc_defer [libahci]() {
 3)    pr-5848     |   0.063 us    |                                                        ata_std_qc_defer [libata]();
 3)    pr-5848     |   0.385 us    |                                                      } /* ahci_pmp_qc_defer [libahci] */
 3)    pr-5848     |               |                                                      ata_qc_issue [libata]() {
 3)    pr-5848     |               |                                                        dma_map_sg_attrs() {
 3)    pr-5848     |   0.086 us    |                                                          dma_direct_map_sg();
 3)    pr-5848     |   0.500 us    |                                                        } /* dma_map_sg_attrs */
 3)    pr-5848     |               |                                                        ahci_qc_prep [libahci]() {
 3)    pr-5848     |   0.074 us    |                                                          ata_tf_to_fis [libata]();
 3)    pr-5848     |   0.510 us    |                                                        } /* ahci_qc_prep [libahci] */
 3)    pr-5848     | + 13.155 us   |                                                        ahci_qc_issue [libahci]();
 3)    pr-5848     | + 15.299 us   |                                                      } /* ata_qc_issue [libata] */
 3)    pr-5848     | + 18.050 us   |                                                    } /* ata_scsi_translate [libata] */
 3)    pr-5848     |   0.062 us    |                                                    _raw_spin_unlock_irqrestore();
 3)    pr-5848     | + 20.357 us   |                                                  } /* ata_scsi_queuecmd [libata] */
 3)    pr-5848     | + 26.123 us   |                                                } /* scsi_queue_rq */
 3)    pr-5848     | + 26.976 us   |                                              } /* blk_mq_dispatch_rq_list */
 3)    pr-5848     | + 31.596 us   |                                            } /* __blk_mq_do_dispatch_sched */
 3)    pr-5848     |               |                                            __blk_mq_do_dispatch_sched() {
 3)    pr-5848     |   0.041 us    |                                              dd_has_work();
 3)    pr-5848     |   0.535 us    |                                            } /* __blk_mq_do_dispatch_sched */
 3)    pr-5848     | + 32.795 us   |                                          } /* __blk_mq_sched_dispatch_requests */
 3)    pr-5848     | + 33.121 us   |                                        } /* blk_mq_sched_dispatch_requests */
 3)    pr-5848     | + 33.478 us   |                                      } /* __blk_mq_run_hw_queue */
 3)    pr-5848     | + 33.907 us   |                                    } /* __blk_mq_delay_run_hw_queue */
 3)    pr-5848     | + 34.645 us   |                                  } /* blk_mq_run_hw_queue */
 3)    pr-5848     | + 37.767 us   |                                } /* blk_mq_sched_insert_requests */
 3)    pr-5848     | + 38.153 us   |                              } /* blk_mq_flush_plug_list */
 3)    pr-5848     | + 38.557 us   |                            } /* blk_flush_plug_list */
 3)    pr-5848     | + 39.060 us   |                          } /* blk_finish_plug */
 3)    pr-5848     | + 89.614 us   |                        } /* read_pages */
 3)    pr-5848     | + 94.473 us   |                      } /* __do_page_cache_readahead */
 3)    pr-5848     | + 94.838 us   |                    } /* ondemand_readahead */
 3)    pr-5848     | + 95.539 us   |                  } /* page_cache_sync_readahead */
 3)    pr-5848     |               |                  pagecache_get_page() {
 3)    pr-5848     |               |                    find_get_entry() {
 3)    pr-5848     |   0.050 us    |                      PageHuge();
 3)    pr-5848     |   0.423 us    |                    } /* find_get_entry */
 3)    pr-5848     |   0.787 us    |                  } /* pagecache_get_page */
 3)    pr-5848     |   0.167 us    |                  _raw_spin_lock_irq();
 3)    pr-5848     |               |                  io_schedule() {
 3)    pr-5848     |   0.051 us    |                    io_schedule_prepare();
 3)    pr-5848     |               |                    schedule() {
 3)    pr-5848     |               |                      rcu_note_context_switch() {
 3)    pr-5848     |   0.046 us    |                        rcu_qs();
 3)    pr-5848     |   0.399 us    |                      } /* rcu_note_context_switch */
 3)    pr-5848     |   0.056 us    |                      _raw_spin_lock();
 3)    pr-5848     |               |                      update_rq_clock() {
 3)    pr-5848     |   0.056 us    |                        update_irq_load_avg();
 3)    pr-5848     |   0.495 us    |                      } /* update_rq_clock */
 3)    pr-5848     |               |                      dequeue_task_fair() {
 3)    pr-5848     |               |                        dequeue_entity() {
 3)    pr-5848     |               |                          update_curr() {
 3)    pr-5848     |   0.059 us    |                            update_min_vruntime();
 3)    pr-5848     |   0.152 us    |                            cpuacct_charge();
 3)    pr-5848     |   0.956 us    |                          } /* update_curr */
 3)    pr-5848     |               |                          __update_load_avg_se() {
 3)    pr-5848     |   0.041 us    |                            __accumulate_pelt_segments();
 3)    pr-5848     |   0.457 us    |                          } /* __update_load_avg_se */
 3)    pr-5848     |               |                          __update_load_avg_cfs_rq() {
 3)    pr-5848     |   0.039 us    |                            __accumulate_pelt_segments();
 3)    pr-5848     |   0.438 us    |                          } /* __update_load_avg_cfs_rq */
 3)    pr-5848     |   0.049 us    |                          clear_buddies();
 3)    pr-5848     |   0.041 us    |                          update_cfs_group();
 3)    pr-5848     |   0.058 us    |                          update_min_vruntime();
 3)    pr-5848     |   3.931 us    |                        } /* dequeue_entity */
 3)    pr-5848     |   0.039 us    |                        hrtick_update();
 3)    pr-5848     |   4.633 us    |                      } /* dequeue_task_fair */
 3)    pr-5848     |               |                      __delayacct_blkio_start() {
 3)    pr-5848     |   0.121 us    |                        ktime_get();
 3)    pr-5848     |   0.464 us    |                      } /* __delayacct_blkio_start */
 3)    pr-5848     |               |                      pick_next_task_fair() {
 3)    pr-5848     |               |                        newidle_balance() {
 3)    pr-5848     |               |                          update_blocked_averages() {
 3)    pr-5848     |   0.041 us    |                            _raw_spin_lock_irqsave();
 3)    pr-5848     |   0.081 us    |                            update_rq_clock();
 3)    pr-5848     |   0.094 us    |                            update_rt_rq_load_avg();
 3)    pr-5848     |   0.095 us    |                            update_dl_rq_load_avg();
 3)    pr-5848     |   0.103 us    |                            update_irq_load_avg();
 3)    pr-5848     |   0.060 us    |                            __update_load_avg_cfs_rq();
 3)    pr-5848     |   0.063 us    |                            _raw_spin_unlock_irqrestore();
 3)    pr-5848     |   2.711 us    |                          } /* update_blocked_averages */
 3)    pr-5848     |               |                          load_balance() {
 3)    pr-5848     |               |                            find_busiest_group() {
 3)    pr-5848     |   0.070 us    |                              update_nohz_stats();
 3)    pr-5848     |   0.040 us    |                              idle_cpu();
 3)    pr-5848     |               |                              update_nohz_stats() {
 3)    pr-5848     |               |                                update_blocked_averages() {
 3)    pr-5848     |   0.042 us    |                                  _raw_spin_lock_irqsave();
 3)    pr-5848     |               |                                  update_rq_clock() {
 3)    pr-5848     |               |                                    update_irq_load_avg() {
 3)    pr-5848     |   0.040 us    |                                      __accumulate_pelt_segments();
 3)    pr-5848     |   0.504 us    |                                    } /* update_irq_load_avg */
 3)    pr-5848     |   0.915 us    |                                  } /* update_rq_clock */
 3)    pr-5848     |   0.076 us    |                                  update_rt_rq_load_avg();
 3)    pr-5848     |   0.077 us    |                                  update_dl_rq_load_avg();
 3)    pr-5848     |   0.060 us    |                                  update_irq_load_avg();
 3)    pr-5848     |   0.114 us    |                                  __update_load_avg_cfs_rq();
 3)    pr-5848     |   0.060 us    |                                  _raw_spin_unlock_irqrestore();
 3)    pr-5848     |   3.518 us    |                                } /* update_blocked_averages */
 3)    pr-5848     |   3.902 us    |                              } /* update_nohz_stats */
 3)    pr-5848     |   0.053 us    |                              idle_cpu();
 3)    pr-5848     |               |                              update_nohz_stats() {
 3)    pr-5848     |               |                                update_blocked_averages() {
 3)    pr-5848     |   0.042 us    |                                  _raw_spin_lock_irqsave();
 3)    pr-5848     |   0.130 us    |                                  update_rq_clock();
 3)    pr-5848     |   0.070 us    |                                  update_rt_rq_load_avg();
 3)    pr-5848     |   0.066 us    |                                  update_dl_rq_load_avg();
 3)    pr-5848     |   0.103 us    |                                  update_irq_load_avg();
 3)    pr-5848     |   0.087 us    |                                  __update_load_avg_cfs_rq();
 3)    pr-5848     |   0.062 us    |                                  _raw_spin_unlock_irqrestore();
 3)    pr-5848     |   2.566 us    |                                } /* update_blocked_averages */
 3)    pr-5848     |   2.919 us    |                              } /* update_nohz_stats */
 3)    pr-5848     |   0.044 us    |                              idle_cpu();
 3)    pr-5848     |               |                              update_nohz_stats() {
 3)    pr-5848     |               |                                update_blocked_averages() {
 3)    pr-5848     |   0.073 us    |                                  _raw_spin_lock_irqsave();
 3)    pr-5848     |   0.092 us    |                                  update_rq_clock();
 3)    pr-5848     |   0.079 us    |                                  update_rt_rq_load_avg();
 3)    pr-5848     |   0.063 us    |                                  update_dl_rq_load_avg();
 3)    pr-5848     |   0.085 us    |                                  update_irq_load_avg();
 3)    pr-5848     |   0.063 us    |                                  __update_load_avg_cfs_rq();
 3)    pr-5848     |   0.062 us    |                                  _raw_spin_unlock_irqrestore();
 3)    pr-5848     |   2.537 us    |                                } /* update_blocked_averages */
 3)    pr-5848     |   2.881 us    |                              } /* update_nohz_stats */
 3)    pr-5848     |   0.039 us    |                              idle_cpu();
 3)    pr-5848     | + 12.912 us   |                            } /* find_busiest_group */
 3)    pr-5848     | + 13.427 us   |                          } /* load_balance */
 3)    pr-5848     |   0.040 us    |                          __msecs_to_jiffies();
 3)    pr-5848     |   0.038 us    |                          _raw_spin_lock();
 3)    pr-5848     | + 17.782 us   |                        } /* newidle_balance */
 3)    pr-5848     | + 18.165 us   |                      } /* pick_next_task_fair */
 3)    pr-5848     |               |                      put_prev_task_fair() {
 3)    pr-5848     |               |                        put_prev_entity() {
 3)    pr-5848     |   0.043 us    |                          check_cfs_rq_runtime();
 3)    pr-5848     |   0.403 us    |                        } /* put_prev_entity */
 3)    pr-5848     |   0.738 us    |                      } /* put_prev_task_fair */
 3)    pr-5848     |               |                      pick_next_task_idle() {
 3)    pr-5848     |   0.039 us    |                        set_next_task_idle();
 3)    pr-5848     |   0.375 us    |                      } /* pick_next_task_idle */
 3)    pr-5848     |               |                      enter_lazy_tlb() {
 3)    pr-5848     |               |                        switch_mm() {
 3)    pr-5848     |               |                          switch_mm_irqs_off() {
 3)    pr-5848     |   0.594 us    |                            load_new_mm_cr3();
 3)    pr-5848     |   1.351 us    |                          } /* switch_mm_irqs_off */
 3)    pr-5848     |   1.691 us    |                        } /* switch_mm */
 3)    pr-5848     |   2.015 us    |                      } /* enter_lazy_tlb */
 3)    pr-5848     |   0.142 us    |                      copy_fpregs_to_fpstate();
 3)    pr-5848     |   0.156 us    |                      finish_task_switch();
 3)    pr-5848     | ! 465.876 us  |                    } /* schedule */
 3)    pr-5848     | ! 466.622 us  |                  } /* io_schedule */
 3)    pr-5848     |   0.053 us    |                  finish_wait();
 3)    pr-5848     |   0.069 us    |                  mark_page_accessed();
 3)    pr-5848     |               |                  _cond_resched() {
 3)    pr-5848     |   0.042 us    |                    rcu_all_qs();
 3)    pr-5848     |   0.358 us    |                  } /* _cond_resched */
 3)    pr-5848     |               |                  _cond_resched() {
 3)    pr-5848     |   0.041 us    |                    rcu_all_qs();
 3)    pr-5848     |   0.356 us    |                  } /* _cond_resched */
 3)    pr-5848     |               |                  pagecache_get_page() {
 3)    pr-5848     |               |                    find_get_entry() {
 3)    pr-5848     |   0.048 us    |                      PageHuge();
 3)    pr-5848     |   0.542 us    |                    } /* find_get_entry */
 3)    pr-5848     |   0.899 us    |                  } /* pagecache_get_page */
 3)    pr-5848     |               |                  touch_atime() {
 3)    pr-5848     |               |                    atime_needs_update() {
 3)    pr-5848     |               |                      current_time() {
 3)    pr-5848     |   0.095 us    |                        ktime_get_coarse_real_ts64();
 3)    pr-5848     |   0.066 us    |                        timestamp_truncate();
 3)    pr-5848     |   0.804 us    |                      } /* current_time */
 3)    pr-5848     |   1.317 us    |                    } /* atime_needs_update */
 3)    pr-5848     |   1.695 us    |                  } /* touch_atime */
 3)    pr-5848     | ! 571.940 us  |                } /* generic_file_buffered_read */
 3)    pr-5848     | ! 572.293 us  |              } /* generic_file_read_iter */
 3)    pr-5848     |               |              xfs_iunlock [xfs]() {
 3)    pr-5848     |   0.062 us    |                up_read();
 3)    pr-5848     |   0.454 us    |              } /* xfs_iunlock [xfs] */
 3)    pr-5848     | ! 575.148 us  |            } /* xfs_file_buffered_aio_read [xfs] */
 3)    pr-5848     | ! 575.728 us  |          } /* xfs_file_read_iter [xfs] */
 3)    pr-5848     | ! 576.240 us  |        } /* new_sync_read */
 3)    pr-5848     | ! 576.601 us  |      } /* __vfs_read */
 3)    pr-5848     |   0.042 us    |      __fsnotify_parent();
 3)    pr-5848     |   0.088 us    |      fsnotify();
 3)    pr-5848     | ! 581.125 us  |    } /* vfs_read */
 3)    pr-5848     | ! 582.583 us  |  } /* ksys_pread64 */
Enter fullscreen mode Exit fullscreen mode
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .