It has been quite some time since a stable release of LTT was made. During that time, quite a few features have been added and considerable effort has been put into achieving kernel integration for the kernel trace points. Unfortunately, these efforts have, at the time of this writing, been unsuccessful. The main reason given to the LTT developers by the kernel development community was that there were not enough Linux users requesting this type of tool for Linux.
The onus is therefore on LTT users to make the case with kernel developers that they need this tool for their day-to-day work; the LTT team can do little more to convince the kernel developers, as its point of view is considered biased because it maintains the tool. So, if you want to see the tracing functionality make it into the kernel, and therefore avoid having to chase an LTT patch for your latest kernel, you should contact the kernel developers by posting to the kernel mailing list: linux-kernel@vger.kernel.org.
Among many other things, Karim Yaghmour gave a presentation about LTT at the Kernel Summit in Ottawa in July 2004. Some suggestions made by the kernel developers at that meeting have been incorporated into 0.9.6pre4.
The main thrust behind making a 0.9.6 release is to provide support for the 2.6.x kernel series. As was discussed on the ltt-devl mailing list, this new version of LTT will depend on relayfs, a filesystem that grew out of the LTT development team's efforts to optimize and generalize the buffering scheme used by LTT. As such, relayfs is not being pushed as a replacement for any existing kernel functionality, but to help make LTT lighter and to make the optimized buffer mechanisms available to a larger audience.
Here's a summary of changes since the opening of the 0.9.6preX branch:
A special thanks to Tim Bird from the CE Linux Forum for going through the ltt-dev
list and picking up some of the pending contributions and integrating it all.
This new version includes many of the features we had decided to add to LTT since the RAS BoF and all the features we had been asked to implement by the folks on the kernel mailing list. Mainly, we now have per-CPU buffering and TSC timestamping (when available). Also, the kernel tracing component is not a driver anymore. Rather, it is accessed through a system call, like other kernel services. The side-effect is that the tracer cannot be loaded as a kernel module anymore. Rather, it can only be built-in.
0.9.6pre2 is a development version and you are encouraged to use it with care.
With the decision by the OLS RAS BoF attendees to standardize on LTT as the main tracing tool for Linux, many enhancements have been made and others are coming to bring LTT's capabilities even further.
Here is a summary of the enhancements in the newly released 0.9.6pre1:
Of course, 0.9.6pre1 is a development release, so please use with care.
This release has been the longest in coming. With over a year of development since the last stable release, the number of enhancements is substantial.
First and foremost, I must thank the various contributors for their work. They were many and their work is very much appreciated.
Here is a rundown of the main content of each of the preX increments:
Between pre6 and the final 0.9.5, quite a few fixes and a couple of enhancements have made their way into LTT. Here's a more detailed list:
[root@Eldorado /tmp]# tracedaemon -p
TraceDaemon: Current event mask 0x00000000003FFFFF
[root@Eldorado /tmp]# tracedaemon -m -eTIMER
TraceDaemon: Set event mask 0x0000000000001000
[root@Eldorado /tmp]# tracedaemon -m -eSCHED
TraceDaemon: Set event mask 0x0000000000000080
Use: -p prints the current trace mask and -m sets the new trace mask according to the -e and -D options provided. Notice that you can still trace a single pid using -P.
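The masks printed above suggest one bit per event type: 0x1000 is bit 12 and 0x80 is bit 7. Here is a minimal sketch of that arithmetic in C. It assumes that the bit position equals the event ID, which is inferred from the daemon's output and not taken from the LTT sources; the names EV_SCHED, EV_TIMER, event_bit() and event_is_traced() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event IDs, inferred from the masks printed by the daemon. */
enum { EV_SCHED = 7, EV_TIMER = 12 };

/* Return a 64-bit mask with only the bit for event_id set. */
uint64_t event_bit(int event_id)
{
    assert(event_id >= 0 && event_id < 64);
    return (uint64_t)1 << event_id;
}

/* Return nonzero if event_id is enabled in mask. */
int event_is_traced(uint64_t mask, int event_id)
{
    return (mask & event_bit(event_id)) != 0;
}
```

With these helpers, event_bit(EV_TIMER) gives 0x1000 and event_bit(EV_SCHED) gives 0x80, matching the two "Set event mask" lines.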
RTAI users should note that starting from 24.1.9 they will not need to patch RTAI anymore in order to use LTT with it, since the proper changes have been integrated into the RTAI CVS.
A number of contributions which were submitted for pre6 have not made it into the final
0.9.5 because I wanted to stabilize the code before adding any major code body. These
contributions will be integrated into the 0.9.6preX series.
0.9.5pre6 is now out and it is the last of the pre-0.9.5 series. I will wait a week or two for bug reports and bug fixes and will release 0.9.5-final. Here's a rundown of what you'll find in this release.
Takuzo O'Hara (Sony) has ported LTT to the MIPS. His port is now part of LTT, but it still needs some work since some of the trace statements in the arch/mips/kernel directory are commented out in order not to crash the machine. I don't have a MIPS-based machine on hand, but if you do then try it out and let us know what you get.
Jörg Hermann (Multilink Gmbh.) has provided icons for RTAI events (they didn't have any before this) and has contributed code to fix the display problems with RTAI's graphs and to provide cross-platform reading capabilities for RTAI traces. The final code for fixing the display and enabling cross-platform reading capabilities for RTAI is based on his work.
Klaas Gadeyne (Leuven university) provided a bug-fix for the linking of the Visualizer.
Frank Rowand (MontaVista) submitted modifications in order to enable the Visualizer to properly build on Solaris machines and provided an update for the system call tables.
Michel Dagenais (École Polytechnique de Montréal) pointed out that there were brackets missing from the kernel patch for some functions in the fs directory. (Frank also reported this a few days later). Michel has been modifying LTT to do some very interesting measurements on Linux's IDE and filesystem code.
Andreas Heppel (Sysgo rt-solutions) provided a number of fixes for the Visualizer and the Event DB.
Theresa Halloran (IBM) provided code to fix bitops problems on the S/390.
Vamsi Krishna (IBM) provided a patch to enable the kernel to compile adequately when tracing is disabled.
Corey Minyard (MontaVista) pointed out (and, thankfully, ferociously debated his point) that there was a race condition in the load/unload/trace code (kernel/trace.c).
I've tested LTT's support for RTAI extensively and haven't found any problems (which doesn't mean that there aren't any). Cross-platform reading capabilities and graph display are now fixed (permanently). There remains the problem of getting RTAI to work on PowerPC. Stock 24.1.8 is broken for PPC and so is 24.1.7 (they still work very well on i386, though). So you may have to go back a couple of RTAI versions if you want to use RTAI on a PPC, but that's out of my hands ... When it gets fixed, I will test the support with LTT on PPC again.
Again, please test this release thoroughly and report any unusual problems. 0.9.5-final should
be out soon.
As I said in the introduction section, 0.9.5pre5 contains many additions. Here are the details for each addition.
Theresa Halloran (IBM) submitted a patch for the S/390. Although this patch was integrated into pre5, Theresa has since then indicated that there were a few problems with pre5 on S/390 (such as autoconf's files being too old). These will be fixed shortly.
Greg Banks (Pocket Penguins) ported LTT to SuperH a while ago, but I never got to integrate his work. It was later updated by Andrea Cisternino (ST Microelectronics). Andrea's patch is part of pre5.
Philippe Gerum (Idealx) converted LTT's build system to autoconf. I spent a non-negligible amount of time trying to figure out how autoconf goes about doing some of its magic. It was then somewhat ironic to read in the FSF's description of said package that it made life so much easier ...
Frank Rowand (MontaVista) submitted quite a number of fixes and updates.
I introduced the concept of "Architecture Variant" to complement the "Architecture Type" in order to fix issues raised by some of Frank's updates. Some architectures, such as the ARM, the MIPS and the PPC, have many variants and it is not sufficient to know the "Architecture Type".
Steve Fink submitted a few fixes for the daemon's options parsing.
Tom Cox (Tripac) first reported that the visualizer died on empty traces. It was a bug in the event database function that does a first-run analysis of the trace.
Dalibor Kranjcic (Hermes, not the French scarf company) reported that when instructing the daemon to trace only one PID, the visualizer showed the events that belonged to other processes, but not the ones belonging to that PID. This was actually caused by a lack of context: with so few events recorded, the visualizer could not reconstruct the system's state. I have partly fixed this, but be aware that it is entirely possible to compromise the visualizer's ability to interpret the event sequences by playing with the daemon's options. If you play with the daemon's options and see something weird in the visualizer, first ask yourself if this is not due to your settings.
Support for RTAI has been updated. LTT 0.9.5pre5 can now successfully trace RTAI 24.1.7. All required patches are included. Although RTAI still needs to be patched, the trace statements are already part of it so the patch fixes some very minor issues. NOTE1: the Visualizer cannot display the RTAI graph correctly for now, this is a known bug. NOTE2: cross-platform reading capability is still not available for RTAI. Both issues should be solved within the next release, but you can still read the traces and view the event sequences without a problem in the current release. Note that the LTT patch to use 2.4.16 with RTAI and LTT includes the RTAI patch for 24.1.7. Hence, you only need to patch Linux 2.4.16 with the patch provided by LTT and then proceed to configure both the kernel and RTAI as you would usually. In fact, you won't find a patch for 2.4.16 in the "patches" directory of 24.1.7. The one I used comes from the CVS repository.
I've removed the restraints on the LTT development list. Anyone can now freely subscribe to
the list without requiring approval. Please keep the list for development issues only. There is
a users list and user questions should be posted there.
0.9.5pre4 adds user-space events logging capabilities and the ability to dynamically modify trace masks programmatically from user-space. To do so, a number of modifications have been made both in the kernel and in the user-space tools. And the trace format has changed yet again. It is now at 1.12. A patch implementing 1.12 is included with LTT 0.9.5pre4 for linux 2.4.16.
To accommodate the additions, a new library has been created, libusertrace. This library is found
in TraceToolkit/LibUserTrace. Use the usual "make; make install" and follow this with an "ldconfig"
in order to update the library database. Here is the summary of the API provided by the library:
int trace_attach
    (void);
int trace_detach
    (void);
int trace_create_event
    (char* /* String describing event type */,
     char* /* String to format standard event description */,
     int   /* Type of formatting used to log event data */,
     char* /* Data specific to format */);
int trace_destroy_event
    (int   /* The event ID given by trace_create_event() */);
int trace_user_event
    (int   /* The event ID given by trace_create_event() */,
     int   /* The size of the raw data */,
     void* /* Pointer to the raw event data */);
int trace_set_event_mask
    (trace_event_mask  /* The event mask to be set */);
int trace_get_event_mask
    (trace_event_mask* /* Pointer to variable where to set event mask retrieved */);
int trace_enable_event_trace
    (int   /* Event ID whose tracing is to be enabled */);
int trace_disable_event_trace
    (int   /* Event ID whose tracing is to be disabled */);
int trace_is_event_traced
    (int   /* Event ID to be checked for tracing */);
If you are familiar with the kernel-space API to create and log custom events, you will notice that these functions are almost identical, with the exception of the mask manipulation functions, which have no direct equivalent in kernel-space. The only other difference is the trace function, which is called trace_user_event() instead of trace_raw_event(). To get the complete description of each of the above and an insight into how they operate, take a look at TraceToolkit/LibUserTrace/UserTrace.c. Basically, all interaction between the library and the trace device is done through the ioctl() call.
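To give an idea of how the calls listed above fit together, here is a rough sketch of one attach/create/log/destroy/detach cycle. The function signatures are the ones from the API summary, but everything else is an assumption: the stub bodies exist only so the sketch compiles without libusertrace, the "negative means failure" convention is a guess, and the value given to CUSTOM_EVENT_FORMAT_TYPE_HEX is a placeholder. The examples shipped in the Examples directory remain the reference.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Stand-in stubs (NOT the real libusertrace) -------------------------
 * With the real library, drop this section and use its header instead.
 * Return conventions and the constant's value are assumptions. */
#define CUSTOM_EVENT_FORMAT_TYPE_HEX 2   /* placeholder value */
int stub_logged_bytes = 0;               /* lets the sketch observe logging */

int trace_attach(void) { return 0; }
int trace_detach(void) { return 0; }
int trace_create_event(char *type, char *desc, int fmt_type, char *fmt_data)
{
    (void)type; (void)desc; (void)fmt_type; (void)fmt_data;
    return 1;                            /* pretend event ID */
}
int trace_destroy_event(int id) { (void)id; return 0; }
int trace_user_event(int id, int size, void *data)
{
    (void)id; (void)data;
    assert(size >= 0);
    stub_logged_bytes += size;
    return 0;
}
/* ---- End of stubs ------------------------------------------------------ */

/* Attach, create one event type, log one raw event, then clean up.
 * Returns 0 on success, -1 on any failure. */
int log_one_event(void)
{
    char type[] = "Sample";              /* hypothetical event type name */
    unsigned char payload[4] = { 0xDE, 0xAD, 0xBE, 0xEF };
    int id, rc;

    if (trace_attach() < 0)
        return -1;
    id = trace_create_event(type, NULL, CUSTOM_EVENT_FORMAT_TYPE_HEX, NULL);
    if (id < 0) {
        trace_detach();
        return -1;
    }
    rc = trace_user_event(id, (int)sizeof(payload), payload);
    trace_destroy_event(id);
    trace_detach();
    return rc < 0 ? -1 : 0;
}
```

The point of the sketch is the ordering: the event type is created once, its ID is reused for every trace_user_event() call, and the event is destroyed before detaching.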
The library uses a new trace device, /dev/tracerU, which can be automatically created when running the createdev.sh script found in the package's root directory. This device has the same major number as the /dev/tracer entry but has a minor number of 1 instead of 0. The /dev/tracerU device can be opened many times in parallel whereas /dev/tracer can only be opened once, usually by the trace daemon.
In order to facilitate usage of the API, I've included examples in the new Examples directory. There are 2 new sets of examples, one for user event logging and one for dynamic modification of trace masks from user-space.
Notice that mask modification is logged as part of the trace. The event database accumulates all the masks using a bitwise AND to obtain the minimal set of events that were traced during the whole trace. This, in turn, is used by the Visualizer to determine whether it can actually draw the trace or not.
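The bitwise-AND accumulation described above can be sketched in a few lines of C. This is an illustration of the idea only, not the event database's actual code; the function name common_event_mask() and the use of a 64-bit integer for the mask are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AND together every mask that was in effect during a trace.
 * A bit survives only if its event was enabled the whole time.
 * With count == 0 the result is all ones (the identity for AND). */
uint64_t common_event_mask(const uint64_t *masks, size_t count)
{
    uint64_t common = ~(uint64_t)0;
    size_t i;

    assert(masks != NULL || count == 0);
    for (i = 0; i < count; i++)
        common &= masks[i];
    return common;
}
```

For example, a trace that went through the masks 0x3FFFFF, 0x1080 and 0x1000 ends up with a common mask of 0x1000: only that one event was traced from start to finish.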
0.9.5pre4 has been tested both on i386 and PPC and works fine. RTAI tracing still doesn't work but will be fixed for 0.9.5 final.
I've also added the complete text of my Master's thesis in the documentation section.
Since this was done at the École Polytechnique de Montréal, the text is in French.
The document gives a complete breakdown of the research behind LTT, its architecture and
performance. If you're looking for a summary of the thesis then have a look at the Usenix
article available in the same section.
Trace format 1.10 had a bug at the boundary between buffers because of size
changes in the data type of the event size data component. 0.9.5pre3 implements format
1.11 which corrects this.
With 0.9.5pre2 out, there are a couple of interesting additions for anyone wanting to make advanced
use of custom events and event analysis. Here is the new layout of the LTT sources:
You will find a patch for Linux 2.4.5 in the "Patches" directory which you can apply on a fresh copy of the sources from www.kernel.org.
In the "CustomReader" directory, you will find the code for a reader that uses LibLTT to read the trace and run a couple of custom event formatting operations. This is the prime example of retrieving and modifying custom formatting data about custom events from a trace. Along with this reader, you will find example modules that use custom events in "Test/CustomEvents". This latter example shows how to create and log custom events from kernel space.
To know how to build your own reader to retrieve and format the
custom events, I strongly encourage you to follow the example in
"CustomReader". Here are the functions that you will be interested
in to retrieve/modify event formatting data:
customEventDesc* DBEventGetCustomDescription
    (db*    /* The database to which the event belongs */,
     event* /* The event whose custom formatting data is looked for */);
char* DBEventGetFormat
    (db*    /* The database to which the event belongs */,
     event* /* The event whose custom formatting data is looked for */,
     int*   /* Pointer to int where to store the format type */);
char* DBEventGetFormatByCustomID
    (db*    /* The database to which the event belongs */,
     int    /* The custom event ID whose custom formatting data is looked for */,
     int*   /* Pointer to int where to store the format type */);
char* DBEventGetFormatByCustomType
    (db*    /* The database to which the event belongs */,
     char*  /* The custom event type string whose custom formatting data is looked for */,
     int*   /* Pointer to int where to store the format type */);
int DBEventSetFormat
    (db*    /* The database to which the event belongs */,
     event* /* The event whose custom formatting data is to be set */,
     int    /* The format type being set */,
     char*  /* Custom event formatting string to be set */);
int DBEventSetFormatByCustomID
    (db*    /* The database to which the event belongs */,
     int    /* The custom event ID whose custom formatting data is to be set */,
     int    /* The format type being set */,
     char*  /* Custom event formatting string to be set */);
int DBEventSetFormatByCustomType
    (db*    /* The database to which the event belongs */,
     char*  /* The custom event type string whose custom formatting data is to be set */,
     int    /* The format type being set */,
     char*  /* Custom event formatting string to be set */);
These seem arcane because they are taken out of context, but the
example in "CustomReader" shows how they are used. You may also
want to have a look at the commented headers of those functions in
"LibLTT/EventDB.c". There are also the kernel functions which enable you to log the custom
formatting information:
Initial event creation:
int trace_create_event
(char* /* String describing event type */,
char* /* String to format standard event description */,
int /* Type of formatting used to log event data */,
char* /* Data specific to format */);
Example:
omega_id = trace_create_event("Omega",
NULL,
CUSTOM_EVENT_FORMAT_TYPE_XML,
"<event name=\"Omega\" size=\"0\"><var name=\"a_byte\" type=\"u8\"/></event>");
The first argument simply provides a short string description of the event in order to be able to search for it or recognize it within the trace. The ID returned is used to log future events belonging to this type. The second argument, which is also a string, is only used when the trace_std_formatted_event() function described below is called. The third argument describes the type of custom formatting being used. These are the possible values:
CUSTOM_EVENT_FORMAT_TYPE_NONE, no particular format
CUSTOM_EVENT_FORMAT_TYPE_STR, print data as a string
CUSTOM_EVENT_FORMAT_TYPE_HEX, print data as hexadecimal
CUSTOM_EVENT_FORMAT_TYPE_XML, data is XML formatted
CUSTOM_EVENT_FORMAT_TYPE_IBM, data is formatted according to IBM standards
The last argument is a string describing the event format. This will
be available from libltt in user space once the trace has been read
and will be modifiable. When XML is the format (as is the case in the
above example), for instance, this last string is an XML string. Of
course this string can be NULL and a format can be provided after the
trace has been loaded by libltt.
Destruction of a created event:
void trace_destroy_event
(int /* The event ID given by trace_create_event() */);
Example:
trace_destroy_event(omega_id);
Trace an event formatted as a string:
int trace_std_formatted_event
(int /* The event ID given by trace_create_event() */,
... /* The parameters to be printed out in the event string */);
This is there for people who want to write simple strings from within
the kernel.
Example:
trace_std_formatted_event(theta_id);
Trace a custom raw event:
int trace_raw_event
(int /* The event ID given by trace_create_event() */,
int /* The size of the raw data */,
void* /* Pointer to the raw event data */);
This is the main function used to trace events.
Example:
trace_raw_event(omega_id, sizeof(a_byte), &a_byte);
The first parameter is the ID returned at the creation of the event. The second is the size of the data being logged and the last is a pointer to the event data itself.
For a complete example of how to use the complete event database API, take a look at the CustomReader and the Visualizer. There is a lot of detail to be written about the event database. I'll take the time to write complete doc one day ... For now: "Use the source, Luke".
In order to avoid confusion regarding which versions of LTT fit with which patch version, the
trace format version is now readily available to the user. The new patches will all have a name
which makes the trace format version evident. The following is an example:
patch-ltt-linux-2.4.5-vanilla-010909-1.10
This is an LTT patch for Linux 2.4.5 dated 09/09/2001 and generating traces having format 1.10. To
see which version the Visualizer supports, type "tracevisualizer -v". If you try to read a trace
which isn't supported by the trace visualizer, it will exit and tell you the format version
of the trace you are trying to read.
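Since the trace format version is the last dash-separated field of the patch name, a script or tool can pull it out mechanically. Here is a small sketch in C under that assumption; the function name patch_trace_format() is hypothetical and nothing like it is claimed to exist in the LTT sources.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the trace format version at the end of an LTT patch
 * name (the text after the last '-'), or NULL if the name has no '-'. */
const char *patch_trace_format(const char *patch_name)
{
    const char *dash;

    assert(patch_name != NULL);
    dash = strrchr(patch_name, '-');
    return dash ? dash + 1 : NULL;
}
```

Applied to "patch-ltt-linux-2.4.5-vanilla-010909-1.10", the function returns a pointer to "1.10", which can then be compared against the version the Visualizer reports.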
0.9.5pre2 still doesn't support RTAI. However, the RTAI CVS now includes all the instrumentation.
Hence, further versions of LTT will be much lighter since no patching of RTAI will be required.
Stay tuned ...
0.9.5pre1 is now out and includes cross-platform capabilities in order to read traces on machines having different endianness. For now, the only patch available for this is for 2.4.0-test10. This will shortly be fixed, now that this extension has been added. Keep in mind that the trace format has once more changed. 0.9.5pre1 can only read traces generated by the 2.4.0-test10 patch provided (which works on both PPC and i386, by the way). Currently, this cross-platform capability does not support RTAI. This should be part of 0.9.5pre2. Andy has reported that custom events don't work properly cross-platform. I'll be looking into this for pre2. Meanwhile, custom events and, hence, DProbes may not work with pre1.
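Reading a trace written on a machine of the opposite endianness boils down to byte-swapping every multi-byte field as it is read. The following C sketch shows the general technique for 32-bit fields. It is not LTT's actual reader code; the function names and the idea of passing the trace's byte order as a flag (which would come from the trace header) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Return 1 if the host is little-endian, 0 if it is big-endian. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Convert a 32-bit field read from a trace into host byte order.
 * trace_is_little_endian is a hypothetical flag taken from the trace. */
uint32_t trace_to_host32(uint32_t raw, int trace_is_little_endian)
{
    assert(trace_is_little_endian == 0 || trace_is_little_endian == 1);
    return trace_is_little_endian == host_is_little_endian()
               ? raw
               : swap32(raw);
}
```

When the trace and the host agree on byte order, the value passes through untouched, so the same reader code serves both the native and the cross-platform case.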
Added to 0.9.5pre1 are additional capabilities to ease the browsing and analysis of traces with the visualizer. These include additional toolbar items that let you disable or enable the display of certain icons in order to unclutter the display. Additionally, graph drawing has been sped up by omitting overlapping lines. This speeds up the display 10x on the local host and 100x over a network connection. The old capability of drawing all the lines can still be activated, if need be. Also, in order to ease zoom-in/zoom-out, one can now simply click the left/right mouse button inside the graph to zoom in/zoom out. This is simpler than using the magnifying glass icons or Ctrl+PgUp/Ctrl+PgDwn, which required constant realignment with the left edge of the graph before zooming. Now, one only needs to click where the zoom is required and that point is kept in its place while all the rest is zoomed in/out. Eventually, I'd like this to be dynamically enabled/disabled as it is now always enabled.
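The click-to-zoom behavior, where the clicked point stays put while everything around it is scaled, comes down to simple interval arithmetic. Here is a sketch in C of one way to compute the new visible time window. The struct view type, the zoom_at() function and the use of doubles for time are assumptions for illustration, not the visualizer's actual code.

```c
#include <assert.h>

/* Visible time window of the graph. */
struct view { double start; double end; };

/* Zoom by factor around anchor: factor > 1 zooms in, factor < 1 zooms out.
 * The anchor keeps the same relative position inside the window. */
struct view zoom_at(struct view v, double anchor, double factor)
{
    struct view out;

    assert(factor > 0.0);
    assert(v.start <= anchor && anchor <= v.end);
    out.start = anchor - (anchor - v.start) / factor;
    out.end   = anchor + (v.end - anchor) / factor;
    return out;
}
```

Zooming in 2x at time 25 on a window of [0, 100] yields [12.5, 62.5]: the clicked point is still a quarter of the way across the window, so no realignment is needed afterwards.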
In addition to Andy Lowe's work on cross-platform reading and Rocky Craig's work on the visualizer, many patch fixes have been provided. Bob Montgomery (HP) has provided a fix for a race condition that occurred on multi-processor machines. Peng Dai (Mission-Critical Linux) provided a fix for a problem that occurred on SMP boxes with 2.4.0-test10 since these new versions use the NMI as an SMP watchdog. Both these fixes are part of the patch included with 0.9.5pre1.
On the upcoming side of things, Manu Sporny has submitted a patch for 2.4.2 using 0.9.4. Christophe Boyanique has submitted a patch for 2.2.18 using 0.9.4. Tom Cox has submitted a patch for rtai 1.6. All these are in the ExtraPatches directory on the ftp server. I haven't tested any of those, they are provided for your convenience, but please don't send any questions if they don't work properly. I'll definitely try those out and make sure they are in sync with 0.9.5pre2 and release them as part of that.
Greg Banks (Pocket Penguins) has worked on an SH port of LTT and has made some interesting suggestions
to the project. The SH patch will be included with LTT as soon as the cross-platform issues are
addressed. Meanwhile, Greg's suggestions to use the "-Wall" to build the daemon and the visualizer have
yielded some interesting insight on the tools' grey areas.
It's finally out. The stable release of LTT for PowerPC with support for both Linux and RTAI. This one had been in the pipeline for a while. Now that it's out, let's look at what's inside.
0.9.4 is a direct result of the 0.9.4preX series. Given the growth of LTT, expect to see more and more preX series as more and more people are contributing to this effort. To sum up, here is what transpired in the 0.9.4preX series:
On top of pre4, the final 0.9.4 includes numerous bug fixes and full support for both Linux and RTAI for the PowerPC. You can now use LTT to trace a PowerPC-based machine and run the visualization tools to look at the traces on that same machine. The capability to read traces cross-platform has been submitted by Andy Lowe from MontaVista and will be part of 0.9.5pre1 (which should be out very soon) along with other submissions.
Along with this new release, notice the addition of the "mailing lists" and the "links" sections. The first is a link to the newly created LTT mailing lists. The second is meant to provide entry points to other projects related to LTT. There are 3 mailing lists, one for LTT users, one for LTT announcements and one for LTT developers. For the moment, the third is restricted, but if you feel you can make a useful contribution please speak about it with the project manager.
Most files pertaining to LTT have now been moved to the FTP server of Opersys. Please take a look at http://www.opersys.com/ftp/pub/LTT/ to see what's in there. Notice the "ExtraPatches" directory which contains patches not part of the standard LTT package but that still do work with some versions of LTT to provide support for a certain version of the kernel or of RTAI.
Now that PowerPC support is out, I'm working on support for the ARM and the MIPS architectures (in alpha state for now) along with other enhancements to the user-side interfaces. As more and more architectures become supported, it becomes harder and harder to coordinate kernel patch releases with user-side tools of LTT. There will probably be a split somewhere down the line where kernel patches will be available apart from complete LTT releases. Also, contrary to other types of patches, updating LTT patches to recent kernels requires a certain amount of work and testing to ensure cross-platform compatibility. Hence, don't expect every kernel version out there to have LTT patches available for it. If it becomes a need to have a patch then "ask not what LTT can do for you, but what you can do for LTT" and think about submitting a patch.
I'd like to thank Lineo for their ongoing support for my work on this project. I'd also like
to thank everyone out there who has submitted contributions to LTT. Such contributions are the
building blocks for the future of LTT.
As you've read on the title page, you can now use DProbes and LTT to dynamically insert trace points anywhere you would like. A great deal of thanks to the IBM DProbes team, especially Richard and Vamsi, for their feedback and advice.
Although the following won't discuss how to insert probe points and how to install DProbes, I will discuss how it works and how the added functionality can be used for other purposes too. That said, it is sufficient to say that once you have installed DProbes, and have installed LTT, as discussed in the Help files (see documentation section), you will be able to trace the system as usual with the addition of the dynamically inserted trace points.
Before going any further, note that the installation procedure has slightly changed for LTT. Rather than copying anything by hand into your favorite "bin" directory, simply "make install" once you're done compiling the daemon and the visualization tool and they will be installed, along with the corresponding scripts, in /usr/sbin. This can be changed in the respective makefiles.
Here are the functions added that enable DProbes to interact with LTT and that also provide the programmer with the capability of dynamically creating event types and logging the corresponding events:
custom.c is an example kernel module that uses the custom event tracing facilities provided.
And here is the output resulting from tracing while this module is loaded and unloaded:
Syscall exit 975,040,384,033,092
Syscall entry 975,040,384,033,167 SYSCALL : init_module; EIP : 0x0804A82D
Memory 975,040,384,033,168 PAGE ALLOC ORDER : 0
Memory 975,040,384,033,173 PAGE ALLOC ORDER : 0
Memory 975,040,384,033,178 PAGE FREE ORDER : 0
Memory 975,040,384,033,179 PAGE FREE ORDER : 0
Event creation 975,040,384,033,184 NEW EVENT TYPE : Alpha
Event creation 975,040,384,033,188 NEW EVENT TYPE : Omega
Event creation 975,040,384,033,191 NEW EVENT TYPE : Theta
Event creation 975,040,384,033,194 NEW EVENT TYPE : Delta
Event creation 975,040,384,033,196 NEW EVENT TYPE : Rho
Alpha 975,040,384,033,209 Number 1, String We are initializing the
Omega 975,040,384,033,212 Number 25, Char c
Theta 975,040,384,033,213 Plain string
Delta 975,040,384,033,215 00 00 00 00 00 00 00 00
Rho 975,040,384,033,216 12
Syscall exit 975,040,384,033,217
Syscall entry 975,040,384,033,249 SYSCALL : close; EIP : 0x0804AE41
File system 975,040,384,033,251 CLOSE : 3
Syscall exit 975,040,384,033,255
Although DProbes is only available for the i386 (for now), the custom event tracing functions of LTT will work both on the i386 and the PPC.
For those of you who are interested in doing real-time in Linux and, consequently,
in the RTAI extensions to LTT, there will be a presentation about LTT for RTAI at
the second Real-Time Linux Workshop.
For the last year or so, anyone wanting to use LTT had to resort to a bothersome manipulation in order to ensure that the kernel was able to allocate the memory needed by the event buffers to record the trace data. This consisted of running the trace daemon at least once right after system startup, even for a very short time.
The trace driver used the __get_free_pages() function in order to allocate the large buffers it needed. Once it called __get_free_pages(), it then went through the allocated space page by page to ensure that the pages were locked into physical memory. Once that was done, the daemon could then call mmap() in order to remap the driver's pages into its own address space. This was done in order to avoid having to read() the data from the driver into the daemon's space and then write() it back to kernel space.
The problem with __get_free_pages() was that it failed if it was called late into system operation since the kernel was unable to find large contiguous regions of memory (by default, the trace daemon configures the driver to use 2M of memory). Hence, if you didn't run the daemon close to startup, it would fail to set the required memory and print something like: "TraceDaemon: Unable to set data regions used for tracing".
I ran into Stephen Tweedie at Usenix who suggested I use vmalloc() and said that __get_free_pages() was the wrong function to use. At that time, I didn't have the time to investigate this more, but I definitely took note of it. Later, when I started to play around with vmalloc(), I noticed that I was unable to call on remap_page_range() from the driver using the vmalloc()ated region to remap it to the daemon's space. Something wasn't working properly.
I finally understood that I had to call remap_page_range() for each page returned by vmalloc() in order to lock it into memory. That's when I remembered that Steve Papacharalambous, from Lineo, had suggested I look into the BTTV driver which included some kind of wizardry in regard to memory allocation. And there it was: the guys that wrote the BTTV driver seem to have run into the same problem I had (of having to allocate large kernel memory chunks which would then be remapped into a process's address space) and they had written these very nice functions that took care of all the dirty work: rvmalloc() and rvfree().
Late last month, I finally got to play around with rvmalloc() and rvfree() but for some reason, it still didn't work. When I called on the memory allocation routine the system would freeze. A couple of days ago, I went back to it and noticed that the shared-memory module from RTAI was also using the BTTV stuff. That's when I noticed a nice macro called REAL_SIZE(). That's when some part of my brain said: Gotcha. The problem was the trace daemon was passing a buffer size which wasn't page-bound and rvmalloc() relied on that. In the BTTV driver they didn't need to make sure that the size used was page-bound since the default frame-buffer size was fixed and page-bound. Hence, I used REAL_SIZE() which I renamed FIX_SIZE() (since that's what it actually does) and now the rvmalloc stuff from the BTTV driver works fine.
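The fix amounts to rounding the requested buffer size up to a whole number of pages before handing it to the allocator. Here is a user-space sketch of that rounding in C. It is not the actual REAL_SIZE()/FIX_SIZE() macro from the BTTV driver or the LTT patch, whose exact definition is not reproduced here; the function name and the fixed 4096-byte page size are assumptions (kernel code would use PAGE_SIZE).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for the sketch */

/* Round size up to the next multiple of the page size.
 * Sizes that are already page-aligned are returned unchanged. */
size_t round_up_to_page(size_t size)
{
    /* Guard against wrap-around when adding the rounding term. */
    assert(size <= (size_t)-1 - (SKETCH_PAGE_SIZE - 1));
    return (size + SKETCH_PAGE_SIZE - 1) & ~((size_t)SKETCH_PAGE_SIZE - 1);
}
```

A request for 4097 bytes becomes 8192, while the daemon's default of 2M (2097152 bytes) is already a multiple of 4096 and passes through unchanged.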
What this means is that it isn't necessary to start the trace daemon at least once close to system startup. Just fire it up whenever you need to trace and you will be fine.
I've also fixed a bug in the way the event masks were being passed onto the trace driver. This had been pointed out by Bao-Gang Liu when he first sent me his patches.
That said, given all the recent announcements about tracing tools for Linux, I've written
an editorial on
LinuxDevices.com about LTT where I discuss the past,
the present and the future of LTT. This should be an interesting read for whoever wants
to know more about LTT and how it is managed.
I had been working on a PowerPC port for a while and was on the verge of making an announcement when everything broke out at the same time. It seems many were inspired to write a PPC port during the same time frame. First, I got contacted by Bao Gang Liu from Agilent China. He's the first person to submit PPC code for LTT. The only part he had which I hadn't completed was the assembly necessary to trace the PPC. That said, his part about fooling the compiler to use another kernel_thread was great. Also, using macros instead of complete assembly code in head.S was as clean as that part should be.
I was working on trying to figure out the bug with the syscall entry when I got word from Andy Lowe from MontaVista that he had completed a port of LTT for PPC. They had also done a press release and the whole thing. Before looking at the code, I was most bothered by the fact that I hadn't been told that this was going on. In the past, I have never refused to accept outside help or contributions, though I do have a problem with putting the future of an open-source project at risk by providing alternative source code. Granted, there are situations when such things are necessary, but these are mostly situations where human conflicts cannot be resolved. This wasn't the case.
As a notice, be aware that I was not involved in producing the code available on MontaVista's site and, as such, can make no guarantees as to the quality of the source found there.
Having said that, Andy's code was helpful in providing a solution to the syscall entry problem I had before, though the final solution to trace syscall entries is somewhat different and is a cross between Bao Gang's code and Andy's. Also, though not included in 0.9.4pre2, Andy wrote a wrapper enabling traces generated on a PPC to be read on an i386. Before including this in the standard tree, I'd like to see it generalized and made independent of the build (since right now, you would have to build LTT with explicit flags to read PPC traces on i386). As with other cross-platform features coming into LTT, I want to make sure it doesn't start being a mess.
To that end, I've added an "ArchType" field in the trace header. Using that, the tool can identify the origin of the trace and apply the right set of rules to read it, regardless of the architecture it is running on. The "TraceType" field has become the "SystemType" field (used to identify the OS). Therefore, as we go along, adding other OSes and the architectures they run on should be made easier.
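To picture how an architecture field lets the reader pick the right rules, here is a small sketch. Everything in it is an assumption made for illustration: the struct layout, the enum values and the helper name are hypothetical and are not LTT's actual header format. It relies only on the fact that 32-bit PPC is big-endian and i386 is little-endian, and it assumes the header fields have already been decoded.

```c
#include <stdint.h>

/* Hypothetical identifiers; the real values used by LTT are not given in the text. */
enum sketch_arch_type { SKETCH_ARCH_I386 = 1, SKETCH_ARCH_PPC = 2 };
enum sketch_system_type { SKETCH_SYS_LINUX = 1 };

/* Hypothetical, already-decoded trace header. */
struct sketch_trace_header {
    uint32_t arch_type;    /* architecture the trace was recorded on */
    uint32_t system_type;  /* operating system the trace was recorded on */
};

/*
 * Decode a 32-bit field from raw trace bytes using the byte order of the
 * architecture that produced the trace, independent of the host running
 * this code.
 */
static uint32_t sketch_read_u32(const struct sketch_trace_header *hdr,
                                const unsigned char *p)
{
    if (hdr->arch_type == SKETCH_ARCH_PPC) {
        /* Big-endian: most significant byte first. */
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    /* Little-endian (i386): least significant byte first. */
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Because the bytes are assembled explicitly, the same reader gives the same answer whether it is compiled on an i386 or on a PPC, without build-time flags.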
0.9.4pre2 should be fairly stable. The patches provided and the user-side tools work on both the i386 and the PPC. Notice that the "Patches/" directory is missing the RTAI stuff. That was intentional, as people wanting to trace RTAI should still use 0.9.3. The patches are applied as usual. For the Tools, you only need to do "make" and they should find out the architecture they are being compiled on and compile accordingly.
A thanks goes out to the Lineo guys for their ongoing support for this project. Their
involvement has made sure that this project remains very much alive and on the bleeding edge.
This is why I've been busy for the last couple of months. This has been complex to bring along, but it's very much worth it. I'd like to thank the Zentropix guys again, without whom I wouldn't have been able to be full time on this and without whose expertise things would have taken longer to finalize.
Even if you don't need real-time, I strongly suggest you at least take a look at the screenshots and play around with the sample RT trace available. If you've wanted to know how real-time interacts with Linux, this will be extremely helpful. All entry points into and out of RTAI are traced, as much as possible. Most interesting to observe are processes using the LXRT layer. With RTAI, a regular Linux process can become hard real-time using only one LXRT call, rt_make_hard_real_time() (by itself, this is very impressive), and it's interesting to observe what happens when Linux and RTAI interact with the same executable entity.
Most complex in bringing RTAI support to LTT was drawing the graphs. With Linux it was rather simple: we were either in the kernel or in a process. At most, we had to determine which process was scheduled and draw the horizontal line at the right height. With RTAI things get much more complex. We can be in 4 different places at any time: the RTAI core, an RTAI task, the Linux kernel or a normal Linux process. With Linux, figuring out where we were was rather easy: look at the current event, look at the next one and we're done. With RTAI, nothing of the sort. To know where we are, we actually need to keep track of the sequence of events that occurred prior to the current event. The best way to do so is to build a state machine that represents the behavior of an RTAI/Linux system. This way, we need to keep track of 2 things, the current state and the next event. This is the first time I've seen this used this way (correct me if I'm wrong) and it's a first implementation of the state machine. It isn't 100% perfect, but it's close.
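The idea of tracking only the current state and the next event can be sketched as a table-driven state machine. The four states come from the description above; the event names and the transition table are assumptions invented for this example and do not reflect LTT's real RTAI event set or its actual state machine.

```c
/* The four places execution can be, as listed in the text. */
enum sketch_state {
    SKETCH_RTAI_CORE,
    SKETCH_RTAI_TASK,
    SKETCH_LINUX_KERNEL,
    SKETCH_LINUX_PROCESS,
    SKETCH_STATE_COUNT
};

/* Hypothetical events, chosen only to make the example readable. */
enum sketch_event {
    SKETCH_EV_RTAI_ENTER,     /* control enters the RTAI core */
    SKETCH_EV_RTAI_TASK_RUN,  /* RTAI core starts running an RTAI task */
    SKETCH_EV_RTAI_TO_LINUX,  /* RTAI core hands control to the Linux kernel */
    SKETCH_EV_KERNEL_ENTER,   /* a Linux process enters the Linux kernel */
    SKETCH_EV_KERNEL_EXIT,    /* the Linux kernel returns to a process */
    SKETCH_EVENT_COUNT
};

/*
 * transition[state][event] gives the next state. Events that make no
 * sense in a given state leave the state unchanged.
 */
static const enum sketch_state
sketch_transition[SKETCH_STATE_COUNT][SKETCH_EVENT_COUNT] = {
    [SKETCH_RTAI_CORE]     = { SKETCH_RTAI_CORE, SKETCH_RTAI_TASK,
                               SKETCH_LINUX_KERNEL, SKETCH_RTAI_CORE,
                               SKETCH_RTAI_CORE },
    [SKETCH_RTAI_TASK]     = { SKETCH_RTAI_CORE, SKETCH_RTAI_TASK,
                               SKETCH_RTAI_TASK, SKETCH_RTAI_TASK,
                               SKETCH_RTAI_TASK },
    [SKETCH_LINUX_KERNEL]  = { SKETCH_RTAI_CORE, SKETCH_LINUX_KERNEL,
                               SKETCH_LINUX_KERNEL, SKETCH_LINUX_KERNEL,
                               SKETCH_LINUX_PROCESS },
    [SKETCH_LINUX_PROCESS] = { SKETCH_RTAI_CORE, SKETCH_LINUX_PROCESS,
                               SKETCH_LINUX_PROCESS, SKETCH_LINUX_KERNEL,
                               SKETCH_LINUX_PROCESS },
};

/* Advance the machine by one event and return the new state. */
static enum sketch_state sketch_next_state(enum sketch_state current,
                                           enum sketch_event event)
{
    return sketch_transition[current][event];
}
```

A graph-drawing loop would feed each trace event through sketch_next_state() and draw the horizontal line at the height matching the returned state.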
I've also generalized the way LTT deals with traces. Rather than having only one way to deal with a trace, LTT now recognizes trace types. Every trace type has its own tables and its own analysis and drawing functions. It is made in such a way as to promote as much reuse as the C language permits. What this means is that adding support for other operating systems to LTT should be fairly straightforward. This might imply a name change, but that's not a problem. I know Richard Stallman is very interested in having this for Hurd. I'd like to help, but I know little of Hurd. Actually, it's Mach, the underlying micro-kernel, that needs to be instrumented in order to provide tracing to Hurd. There are also the BSDs, as Chris Small, the Usenix conference chair, inquired at the last Usenix conference. And there are possibly other OSs. If anyone out there is interested, drop me an e-mail.
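One common way to get this kind of per-type tables and functions in C is a descriptor struct holding a table and function pointers. The sketch below is only an assumption about how such a layout could look; the struct, the event names and the hook are hypothetical and are not taken from the LTT sources.

```c
#include <stddef.h>

/* Hypothetical per-trace-type descriptor: one table plus one drawing hook. */
struct sketch_trace_type {
    const char *name;                /* human-readable trace type name */
    const char *const *event_names;  /* this type's own event table */
    size_t event_count;              /* number of entries in event_names */
    int (*context_count)(void);      /* how many places execution can be */
};

/* Illustrative event tables; the names are invented for this example. */
static const char *const sketch_linux_events[] = {
    "syscall entry", "syscall exit", "schedule change"
};
static const char *const sketch_rtai_events[] = {
    "syscall entry", "syscall exit", "schedule change", "RT task switch"
};

static int sketch_linux_contexts(void) { return 2; } /* kernel or process */
static int sketch_rtai_contexts(void)  { return 4; } /* adds RTAI core and RTAI task */

static const struct sketch_trace_type sketch_trace_types[] = {
    { "Linux",      sketch_linux_events, 3, sketch_linux_contexts },
    { "RTAI/Linux", sketch_rtai_events,  4, sketch_rtai_contexts  },
};

/* Shared helper reused by every trace type: bounds-checked table lookup. */
static const char *sketch_event_name(const struct sketch_trace_type *type,
                                     size_t event_id)
{
    return event_id < type->event_count ? type->event_names[event_id]
                                        : "unknown";
}
```

Supporting another operating system would then mean adding one more entry to the descriptor array, while generic code such as sketch_event_name() is reused as-is.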
That said, if you will only be using LTT to trace vanilla Linux then almost nothing visible has changed, except for the version number and the trace format (sorry, but this is a binary format for compression's sake and it definitely will be in constant change for the foreseeable future).
Note also that only the UP (Uni-Processor) scheduler is instrumented in RTAI. Therefore,
this won't work with the MUP (Multi-Uni-Processor) scheduler or the SMP scheduler. At least,
not for now. This means that when configuring the kernel you will be using with RTAI, make
sure you __DISABLE__ SMP support in the kernel config. Also make sure that the scheduler
module in [rtai_root]/modules is the UP scheduler. This only applies to RTAI tracing; normal
Linux tracing isn't subject to this.
Last June at the Usenix Annual Technical Conference a paper was presented describing
the intricate details of how LTT works and how it impacts the system it is observing.
This paper is now available in the documentation section. Some things have changed since
the paper was written, but the essentials are there.
You can now hook onto any traced events using the following function:
trace_register_callback()
You don't need to have the tracer running for this to work.
The following module is an example of usage of this facility:
my_callback() will be called every time a packet goes out or comes in. This can be used
with other events too. Have a look at include/linux/trace.h to see all the events that
can be traced.
#define MODULE
#include
This is mostly useful for security auditing. Give me feedback on this and how it serves
you best.
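For readers who want a feel for the hook mechanism without building a kernel, the following is a user-space mock of a per-event callback registry. It is not the kernel API: all names, the callback signature and the table sizes are assumptions made for this sketch, and the real trace_register_callback() prototype should be taken from include/linux/trace.h.

```c
#include <stddef.h>

#define SKETCH_MAX_EVENTS    8  /* assumed number of distinct event IDs */
#define SKETCH_MAX_CALLBACKS 4  /* assumed callbacks allowed per event */

/* Hypothetical callback signature for this mock. */
typedef void (*sketch_callback)(int event_id, void *event_data);

static sketch_callback sketch_table[SKETCH_MAX_EVENTS][SKETCH_MAX_CALLBACKS];

/* Register cb for event_id. Returns 0 on success, -1 on bad input or full table. */
static int sketch_register_callback(sketch_callback cb, int event_id)
{
    if (cb == NULL || event_id < 0 || event_id >= SKETCH_MAX_EVENTS)
        return -1;
    for (size_t i = 0; i < SKETCH_MAX_CALLBACKS; i++) {
        if (sketch_table[event_id][i] == NULL) {
            sketch_table[event_id][i] = cb;
            return 0;
        }
    }
    return -1;
}

/* Deliver one event to every registered callback. Returns how many were called. */
static int sketch_emit_event(int event_id, void *event_data)
{
    int called = 0;
    if (event_id < 0 || event_id >= SKETCH_MAX_EVENTS)
        return 0;
    for (size_t i = 0; i < SKETCH_MAX_CALLBACKS; i++) {
        if (sketch_table[event_id][i] != NULL) {
            sketch_table[event_id][i](event_id, event_data);
            called++;
        }
    }
    return called;
}

/* Example callback: count how many packet events have been seen. */
static int sketch_packets_seen;

static void sketch_count_packet(int event_id, void *event_data)
{
    (void)event_id;
    (void)event_data;
    sketch_packets_seen++;
}
```

In this mock, registering sketch_count_packet for one event ID and emitting that event twice leaves sketch_packets_seen at 2, while events with no registered callback are simply ignored.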
This one is for all you keyboard freaks out there (including myself) who would rather do everything on the keyboard than play around with the mouse. This release is mostly an aesthetics update. The core functionality and kernel patch have not changed since the last release, though visualizing a trace and scrolling around has become much easier. The following accelerators have been added:
Here's one of those times where we take the champagne out (or another cup of coffee ...),
a new release of LTT. Most of the outside appearance of LTT hasn't changed. The inner-workings,
on the other hand, have changed substantially ... :
After quite a number of months, I've decided to make the next release of LTT. Quite a
number of things have changed. The following is a non-exhaustive list :