The Linux RT Benchmarking Framework (LRTBF) is a set of drivers and scripts for evaluating the performance of various real-time additions to the Linux kernel. Specifically, the LRTBF measures both the overall load imposed by an RT enhancement and its ability to respond deterministically to incoming interrupts. Initially, the LRTBF was used to evaluate Ingo Molnar's PREEMPT_RT patches and Philippe Gerum's I-pipe (previously part of Adeos).
The LRTBF is released under the GNU GPL. You are encouraged to try it out, and discuss your results on the relevant mailing lists. Of course we welcome contributions and additions. All questions and contributions should be directed to the LRTBF maintainer.
Maintained by: Kristian Benoit (kbenoit@opersys.com)
A new release of the LRTBF is now available, with improved test automation.
2005/06/29: Take 3 test run released
The results of the 3rd test run are now available:
We haven't yet published the updated version of the LRTBF that was used to generate those results, but we will do so shortly.
2005/06/21: Initial release
This is the initial release of the LRTBF. It has already been used for publishing two sets of performance data evaluating Ingo Molnar's PREEMPT_RT and Philippe Gerum's I-pipe. We are providing the LRTBF in the hopes that others will be motivated to carry out their own tests and even extend the test-set to include more evaluation points.
The following postings were made to the Linux kernel mailing list containing data generated by the LRTBF. These postings have generally been very well received by the community, as they help shed more light on the pertinence of the most widely publicized real-time extensions to the Linux kernel.
WARNING: Please do not forget that the enhancements being evaluated here are all works in progress. We therefore encourage you to read these numbers with an open mind.
These results were part of the 2005/07/08 test set.
+--------------------+-------+-------+-------+-------+-------+
| Kernel             | plain | IRQ   | ping  | IRQ & | IRQ & |
|                    |       | test  | flood | ping  | hd    |
+====================+=======+=======+=======+=======+=======+
| Vanilla-2.6.12     | 152 s | 150 s | 188 s | 185 s | 239 s |
+--------------------+-------+-------+-------+-------+-------+
| with RT-V0.7.51-02 | 152 s | 153 s | 203 s | 201 s | 239 s |
| %                  |   ~   |  2.0  |  8.0  |  8.6  |   ~   |
+--------------------+-------+-------+-------+-------+-------+
| with Ipipe-0.7     | 149 s | 150 s | 193 s | 192 s | 236 s |
| %                  | -2.0  |   ~   |  2.7  |  3.8  | -1.3  |
+--------------------+-------+-------+-------+-------+-------+
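The "%" rows above give each kernel's overhead relative to the vanilla baseline ("~" marks a negligible difference). As an illustration of how those rows are derived (this helper is our own sketch, not part of the LRTBF scripts), the percentages can be recomputed from the raw running times:

```shell
# Sketch only: compute per-column overhead (%) of an RT-enhanced kernel's
# total running times relative to the vanilla baseline row of the table.
overhead_pct() {
    awk -v base="$1" -v cand="$2" 'BEGIN {
        n = split(base, b); split(cand, c)
        for (i = 1; i <= n; i++)
            printf "%.1f\n", (c[i] - b[i]) * 100 / b[i]
    }'
}

# Vanilla-2.6.12 vs. RT-V0.7.51-02 rows from the table above:
overhead_pct "152 150 188 185 239" "152 153 203 201 239"
# prints 0.0, 2.0, 8.0, 8.6 and 0.0, one value per line
```

Where the table prints "~", the computed difference rounds to roughly zero.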
Legend:
plain      = nothing special
IRQ test   = on logger: triggering target every 1 ms
ping flood = on host: "sudo ping -f $TARGET_IP_ADDR"
IRQ & ping = combination of the previous two
IRQ & hd   = IRQ test with the following loop running on the target:
"while true
do dd if=/dev/zero of=/tmp/dummy count=512 bs=1M
done"
Data collected by LMbench:
"plain" run:
Measurements | Vanilla | preempt_rt | ipipe
---------------+-------------+----------------+-------------
fork | 97us | 91us (-6%) | 101us (+4%)
open/close | 2.8us | 2.9us (+3%) | 2.8us (~)
execve | 348us | 347us (~) | 356us (+2%)
select 500fd | 13.9us | 17.1us (+23%) | 13.9us (~)
mmap | 776us | 629us (-19%) | 794us (+2%)
pipe | 5.1us | 5.1us (~) | 5.4us (+6%)
"IRQ test" run:
Measurements | Vanilla | preempt_rt | ipipe
---------------+-------------+----------------+-------------
fork | 98us | 91us (-7%) | 100us (+2%)
open/close | 2.8us | 2.8us (~) | 2.8us (~)
execve | 349us | 349us (~) | 359us (+3%)
select 500fd | 13.9us | 17.2us (+24%) | 13.9us (~)
mmap | 774us | 630us (-19%) | 792us (+2%)
pipe | 5.0us | 5.0us (~) | 5.5us (+10%)
"ping flood" run:
Measurements | Vanilla | preempt_rt | ipipe
---------------+-------------+----------------+-------------
fork | 152us | 171us (+13%) | 165us (+9%)
open/close | 4.5us | 4.8us (+7%) | 4.8us (+7%)
execve | 550us | 663us (+21%) | 601us (+9%)
select 500fd | 20.9us | 29.4us (+41%) | 21.9us (+5%)
mmap | 1140us | 1122us (-2%) | 1257us (+10%)
pipe | 8.3us | 9.4us (+13%) | 10.2us (+23%)
"IRQ & ping" run:
Measurements | Vanilla | preempt_rt | ipipe
---------------+-------------+----------------+-------------
fork | 150us | 170us (+13%) | 160us (+7%)
open/close | 4.6us | 5.3us (+15%) | 4.8us (+4%)
execve | 512us | 629us (+23%) | 610us (+19%)
select 500fd | 20.9us | 30.6us (+46%) | 24.3us (+16%)
mmap | 1128us | 1083us (-4%) | 1264us (+12%)
pipe | 9.0us | 9.6us (+7%) | 9.6us (+7%)
"IRQ & hd" run:
Measurements | Vanilla | preempt_rt | ipipe
---------------+-------------+----------------+-------------
fork | 101us | 94us (-7%) | 103us (+2%)
open/close | 2.9us | 2.9us (~) | 3.0us (+3%)
execve | 366us | 370us (+1%) | 372us (+2%)
select 500fd | 14.3us | 18.1us (+27%) | 14.5us (+1%)
mmap           | 794us       | 654us (-18%)   | 822us (+4%)
pipe | 6.3us | 6.5us (+3%) | 7.3us (+16%)
Interrupt response times (all times in microseconds):
+--------------------+------------+------+-------+------+--------+
| Kernel | sys load | Aver | Max | Min | StdDev |
+====================+============+======+=======+======+========+
| | None | 5.8 | 51.9 | 5.6 | 0.3 |
| | Ping | 5.8 | 49.1 | 5.6 | 0.8 |
| Vanilla-2.6.12 | lm. + ping | 6.1 | 53.3 | 5.6 | 1.1 |
| | lmbench | 6.1 | 77.9 | 5.6 | 0.8 |
| | lm. + hd | 6.5 | 128.4 | 5.6 | 3.4 |
| | DoHell | 6.8 | 555.6 | 5.6 | 7.2 |
+--------------------+------------+------+-------+------+--------+
| | None | 5.7 | 48.9 | 5.6 | 0.2 |
| | Ping | 7.0 | 62.0 | 5.6 | 1.5 |
| with RT-V0.7.51-02 | lm. + ping | 7.9 | 56.2 | 5.6 | 1.9 |
| | lmbench | 7.3 | 56.1 | 5.6 | 1.4 |
| | lm. + hd | 7.3 | 70.5 | 5.6 | 1.8 |
| | DoHell | 7.4 | 54.6 | 5.6 | 1.4 |
+--------------------+------------+------+-------+------+--------+
| | None | 7.2 | 47.6 | 5.7 | 1.9 |
| | Ping | 7.3 | 48.9 | 5.7 | 0.4 |
| with Ipipe-0.7 | lm. + ping | 7.6 | 50.5 | 5.7 | 0.8 |
| | lmbench | 7.5 | 50.5 | 5.7 | 0.9 |
| | lm. + hd | 7.5 | 50.5 | 5.7 | 1.1 |
| | DoHell | 7.6 | 50.5 | 5.7 | 0.7 |
+--------------------+------------+------+-------+------+--------+
Here is the complete output of the LMbench runs, averaged over five runs using the lmbsum utility. Each output contains the summary for the vanilla kernel, the kernel with PREEMPT_RT, and the kernel with the I-pipe.
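We do not document lmbsum's internals here; as a hedged illustration of what such an averaging step amounts to (the file name, format, and helper below are our assumptions, not lmbsum's actual interface), a single LMbench metric can be averaged across runs like this:

```shell
# Illustration only -- lmbsum's real implementation may differ.
# Average one LMbench metric across runs, given one value per line.
avg() {
    awk '{ s += $1; n++ } END { if (n) printf "%.1f\n", s / n }' "$@"
}

# Three hypothetical per-run fork latencies, in microseconds:
printf '97\n98\n96\n' > /tmp/fork.dat
avg /tmp/fork.dat    # prints 97.0
```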
The following is a description of the use we made of LRTBF. There are certainly other configurations that can be used and additional tests that can be carried out. Feel free to customize the LRTBF to your needs, and send us back your modifications so that others can profit from your work.
In our tests, we used a setup with three machines. The two main systems were Dell PowerEdge SC420 machines, each with a P4 2.8 (configured UP, not SMP) running FC3. One, with 256 MB of RAM, was the guinea pig (i.e. the machine controlling the mechanical saw), a.k.a. the TARGET. The other, with 512 MB of RAM, was used to collect data on the guinea pig's responsiveness, a.k.a. the LOGGER. The third machine, an Apple PowerBook5,6 (G4 / 1 GB), served a dual purpose: it controlled both the target and the logger via ssh, and it was also used to ping flood the target. This third system is known as the HOST.
Data was generated on all three systems:
Both the host and the logger had a constant kernel configuration. The logger ran an Adeos-enabled kernel in order to trigger and deterministically measure the responsiveness of the target, while the host ran a plain Gentoo-based kernel. The target and the logger were rigged together via their parallel ports, so that an output from the logger would trigger an interrupt on the target, whose response would itself trigger an interrupt on the logger.
In the various test runs, we attempted to collect two sets of data: one regarding LMbench's total running time for a given setup, and the other regarding the system's interrupt response time. Where appropriate, both tests were conducted simultaneously; otherwise, they were conducted in isolation. The preceding tables should be self-explanatory.
For the LMbench test runs, 3 passes were conducted and an average running time was collected. Certainly, 3 passes is not as many as we would like, but for the immediate purposes it provides a sufficiently corroborated data set for analysis (as can be seen in the preceding tables).
For the interrupt response time measurement, the logger generated between 500,000 and 650,000 interrupts and measured the target's response time. The logger was not subject to any load whatsoever, except that imposed by the logging driver (which ran in a higher-priority Adeos domain, and was hence truly hard-rt, a.k.a. "ruby" hard). Data was collected in a relayfs buffer and committed only after all testing was complete, so no impact was generated by having the user-space daemon commit the data to storage. It could be argued that the use of Adeos imposes a penalty on the measured response time. However, this penalty is imposed on all data sets, and its impact can be verified by analyzing the Adeos-to-Adeos setup provided in the first set of results posted to the LKML.
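As a sketch of the reduction step (the input format and helper are our assumptions; the actual LRTBF post-processing may differ), the Aver/Max/Min/StdDev figures reported above can be computed from a file holding one measured response time per line, in microseconds:

```shell
# Sketch only: reduce a column of per-interrupt response times (us),
# as committed from the relayfs buffer, to summary statistics.
# Uses the population standard deviation (divide by n).
latency_stats() {
    awk '{
        s += $1; sq += $1 * $1; n++
        if (n == 1 || $1 < min) min = $1
        if (n == 1 || $1 > max) max = $1
    } END {
        m = s / n
        printf "Aver %.1f Max %.1f Min %.1f StdDev %.1f\n", m, max, min, sqrt(sq / n - m * m)
    }' "$@"
}

# Five hypothetical samples (us):
printf '5.6\n5.8\n5.6\n51.9\n5.6\n' > /tmp/lat.dat
latency_stats /tmp/lat.dat
```

A real run reduces hundreds of thousands of samples this way, which is why the rare long response shows up in Max and StdDev but barely moves the average.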
Kristian Benoit
Karim Yaghmour