In anticipation of doing some work on performance issues, I've been experimenting with our kernel profiler the last few days. For those who don't know, you can activate the profiler by including the /PROFILE option on the command line. On each timer interrupt the current EIP will be saved, after 30 sec a background task will convert the saved EIPs to function name/file name, count how often each function name occurs and write the results to a file %system_root%\profiler.log. There are 100 timer ticks per second, a 30 sec measuring interval will therefore generate 3000 EIPs.
This works very nice, but I'm running into one problem on my 300MHz test machine: conversion of the EIPs to function name by the background thread takes about 60 sec. Almost all of that time is spent looking up the correct .stabs entry from the info in the .sym file. This is done using a linear search. Since there can be 100000 .stabs entries in a .sym file, going through these for each of the 3000 measurements can mean 300000000 compares. Even worse, two passes are made through the .stabs list for each measurement, one to find the function name and one to find the file name. No wonder this takes 60 sec.
So, we need to improve the search. I've done some simulations, and when I use a binary search instead of a linear search, the performance is dramatically better. Time required to convert all measurements drops from 60 sec to 0.06 sec, a 1000 fold improvement.
To be able to use a binary search, the information in the .sym file needs to be sorted and uniform (each element contains the same type of data) which it currently isn't. We have .stabs entries defining a function (N_FUN), source line (N_SLINE) and source file (N_SO). Currently, we only use the .sym files to convert an address to function name/file name/source line, not the other way around (no function name to address for example). I'd like to drop using the .stab format in the .sym files and change to the following format:
+----------+ |header | +----------+ |symentries| +----------+ |strings | +----------+
The header would just contain the start location and number of symentries plus start location and total size of the strings. Each symentry would have the following info:
typedef struct tagSYM_ENTRY { ULONG_PTR Address; ULONG FunctionOffset; ULONG FileOffset; ULONG SourceLine; } SYM_ENTRY, *PSYM_ENTRY;
where Address is an address relative to the beginning of the module, FunctionOffset and FileOffset are offsets from the beginning of the strings section and SourceLine contains the source line number. The symentries are sorted by Address (done by rsym). Each Address would only appear once, and information would be made as complete as possible (e.g. when creating the symentries from the .stabs in the .nostrip.exe file each FunctionOffset would be set to point to the function name from the most recently encountered N_FUN .stab). This will allow us to retrieve all 3 pieces of information (function name, file name and source line) by doing just a single binary search.
Comments, thoughts, objections?
Gé van Geldorp.
-----Original Message----- From: ros-dev-bounces@reactos.com [mailto:ros-dev-bounces@reactos.com] On Behalf Of Ge van Geldorp Sent: 11. januar 2005 10:34 To: ros-dev@reactos.com Subject: [ros-dev] Proposal to change .sym files
Comments, thoughts, objections?
I like this. If you could put the .sym parsing functions in their own static library in the process (for use in reporterror.exe) then that would be even more great.
Casper
Ge van Geldorp wrote:
In anticipation of doing some work on performance issues, I've been experimenting with our kernel profiler the last few days. For those who don't know, you can activate the profiler by including the /PROFILE option on the command line. On each timer interrupt the current EIP will be saved, after 30 sec a background task will convert the saved EIPs to function name/file name, count how often each function name occurs and write the results to a file %system_root%\profiler.log. There are 100 timer ticks per second, a 30 sec measuring interval will therefore generate 3000 EIPs.
This works very nice, but I'm running into one problem on my 300MHz test machine: conversion of the EIPs to function name by the background thread takes about 60 sec. Almost all of that time is spent looking up the correct .stabs entry from the info in the .sym file. This is done using a linear search. Since there can be 100000 .stabs entries in a .sym file, going through these for each of the 3000 measurements can mean 300000000 compares. Even worse, two passes are made through the .stabs list for each measurement, one to find the function name and one to find the file name. No wonder this takes 60 sec.
So, we need to improve the search. I've done some simulations, and when I use a binary search instead of a linear search, the performance is dramatically better. Time required to convert all measurements drops from 60 sec to 0.06 sec, a 1000 fold improvement.
To be able to use a binary search, the information in the .sym file needs to be sorted and uniform (each element contains the same type of data) which it currently isn't. We have .stabs entries defining a function (N_FUN), source line (N_SLINE) and source file (N_SO). Currently, we only use the .sym files to convert an address to function name/file name/source line, not the other way around (no function name to address for example). I'd like to drop using the .stab format in the .sym files and change to the following format:
+----------+ |header | +----------+ |symentries| +----------+ |strings | +----------+
The header would just contain the start location and number of symentries plus start location and total size of the strings. Each symentry would have the following info:
typedef struct tagSYM_ENTRY { ULONG_PTR Address; ULONG FunctionOffset; ULONG FileOffset; ULONG SourceLine; } SYM_ENTRY, *PSYM_ENTRY;
where Address is an address relative to the beginning of the module, FunctionOffset and FileOffset are offsets from the beginning of the strings section and SourceLine contains the source line number. The symentries are sorted by Address (done by rsym). Each Address would only appear once, and information would be made as complete as possible (e.g. when creating the symentries from the .stabs in the .nostrip.exe file each FunctionOffset would be set to point to the function name from the most recently encountered N_FUN .stab). This will allow us to retrieve all 3 pieces of information (function name, file name and source line) by doing just a single binary search.
Comments, thoughts, objections?
Gé van Geldorp.
The only reason I didn't do something like this was because I was worried about the effect on build-time, which at the moment is already horrendous. If this sorting can be done quickly ( i.e. have minimal effect on build time), then I think it's a great idea.
Also, how much larger do you anticipate the sym files getting, and do we have the kmode address space available to devote to this increase in size with room to spare?
I apologize if some of these questions are stupid ;)
From: Royce Mitchell III
The only reason I didn't do something like this was because I was worried about the effect on build-time, which at the moment is already horrendous. If this sorting can be done quickly ( i.e. have minimal effect on build time), then I think it's a great idea.
My gut feeling is that it won't impact build time very much, but I won't know for sure until I've coded the thing.
Also, how much larger do you anticipate the sym files getting, and do we have the kmode address space available to devote to this increase in size with room to spare?
A STAB_ENTRY as currently used is 12 bytes, my proposed SYM_ENTRY is 16 bytes. There can be multiple STAB_ENTRYs for a single address, while there will be only 1 SYM_ENTRY. Strings are handled the same by both formats, they are stored in a string table. All in all, I'd guess the new sym files to be 10% larger than the old ones. Seems a reasonable price to pay.
I apologize if some of these questions are stupid ;)
No such thing as a stupid question. Perhaps asking on IRC whether accept() works is annoying, but it's still not stupid ;)
Gé van Geldorp.
Ge van Geldorp schrieb:
This works very nice, but I'm running into one problem on my 300MHz test machine: conversion of the EIPs to function name by the background thread takes about 60 sec. Almost all of that time is spent looking up the correct .stabs entry from the info in the .sym file. This is done using a linear search. Since there can be 100000 .stabs entries in a .sym file, going through these for each of the 3000 measurements can mean 300000000 compares. Even worse, two passes are made through the .stabs list for each measurement, one to find the function name and one to find the file name. No wonder this takes 60 sec.
Sometimes ago there has exist a profiling function which has used the hash table functions from ex\hashtable.c. The problem is that the hash table functions uses the splay tree functions which are broken. In my local sources I had replaced the splay tree by a simple double linked list. I've never see such speed problems. This was at the begining of 2003. I had used the profiling for compiling ros on ros.
I would like it if we can add the sym files as a no-load section to each binary. This protects from using wrong symbol files and does help for some binaries like packet.sys and packet.dll.
- Hartmut
On Tue, 11 Jan 2005 16:43:17 +0100 Hartmut Birr hartmut.birr@gmx.de wrote:
I would like it if we can add the sym files as a no-load section to each binary. This protects from using wrong symbol files and does help for some binaries like packet.sys and packet.dll.
- Hartmut
I agree that i like symbols better in their binaries than out.
From: art yerkes
On Tue, 11 Jan 2005 16:43:17 +0100 Hartmut Birr hartmut.birr@gmx.de wrote:
I would like it if we can add the sym files as a no-load section to each binary. This protects from using wrong symbol files and does help for some binaries like packet.sys and packet.dll.
I agree that i like symbols better in their binaries than out.
Hmm, maybe we should just use the *.nostrip.* stuff then and possibly convert at runtime? It doesn't make much sense to me to first create a stripped version and then add back the symbols.
Ge van Geldorp.
On Tuesday 11 January 2005 21:44, Ge van Geldorp wrote:
From: art yerkes
On Tue, 11 Jan 2005 16:43:17 +0100
Hartmut Birr hartmut.birr@gmx.de wrote:
I would like it if we can add the sym files as a no-load section to each binary. This protects from using wrong symbol files and does help for some binaries like packet.sys and packet.dll.
I agree that i like symbols better in their binaries than out.
Hmm, maybe we should just use the *.nostrip.* stuff then and possibly convert at runtime? It doesn't make much sense to me to first create a stripped version and then add back the symbols.
Ge van Geldorp.
Converting at runtime could be slow, but we definitly have to convert them into a format which we can save as one large block in memory (and not thousands of small blocks) and easily parse at runtime.
What about using GNU debuglinks? It works like this:
You create the binary with debug info. Then you copy the debug info from the binary into a debug/symbol file. The next step is to strip the debug info from the binary and at the same time add a debug link to the debug file. The debug link is a section in the binary (which can be added with objdump when stripping debug info from the binary) called .gnu_debuglink. It must contain the filename of the debug file (zero-terminated) and (aligned to 4 bytes) the CRC of the debug file (which makes sure that correct symbols will be used with the binary) GNU binutils support it so it's easy to create the symbol files and add the debug links to the binaries.
- blight
Anich Gregor schrieb:
Converting at runtime could be slow, but we definitly have to convert them into a format which we can save as one large block in memory (and not thousands of small blocks) and easily parse at runtime.
What about using GNU debuglinks? It works like this:
You create the binary with debug info. Then you copy the debug info from the binary into a debug/symbol file. The next step is to strip the debug info from the binary and at the same time add a debug link to the debug file. The debug link is a section in the binary (which can be added with objdump when stripping debug info from the binary) called .gnu_debuglink. It must contain the filename of the debug file (zero-terminated) and (aligned to 4 bytes) the CRC of the debug file (which makes sure that correct symbols will be used with the binary) GNU binutils support it so it's easy to create the symbol files and add the debug links to the binaries.
- blight
I'm not sure if you mean objcopy instead of objdump. Objcopy for pe images isn't usable, because it changes the file and section allignments. I think that we need a tool which creates our new debug section as coff object file from the *.nostriped.* and that we need our own linker scripts which adds the section at the correct position within the image. This needs also a little change in freeldr, because the symbol section must be located after the bss section in ntoskrnl.
- Hartmut
From: Hartmut Birr
I'm not sure if you mean objcopy instead of objdump. Objcopy for pe images isn't usable, because it changes the file and section allignments. I think that we need a tool which creates our new debug section as coff object file from the *.nostriped.* and that we need our own linker scripts which adds the section at the correct position within the image. This needs also a little change in freeldr, because the symbol section must be located after the bss section in ntoskrnl.
This is getting a bit over my head, I'm no hero with COFF, linker scripts and binutils in general. If I code the generation of the basic symbol file, would someone else be able (and willing) to change it to a COFF object file and do the linker magic?
Gé van Geldorp.
Ge van Geldorp schrieb:
From: Hartmut Birr
I'm not sure if you mean objcopy instead of objdump. Objcopy for pe images isn't usable, because it changes the file and section allignments. I think that we need a tool which creates our new debug section as coff object file from the *.nostriped.* and that we need our own linker scripts which adds the section at the correct position within the image. This needs also a little change in freeldr, because the symbol section must be located after the bss section in ntoskrnl.
This is getting a bit over my head, I'm no hero with COFF, linker scripts and binutils in general. If I code the generation of the basic symbol file, would someone else be able (and willing) to change it to a COFF object file and do the linker magic?
Gé van Geldorp.
Ros-dev mailing list Ros-dev@reactos.com http://reactos.com:8080/mailman/listinfo/ros-dev
Hi,
the coff format is very simple if no relocation and debug infos are needed. A good starting point is http://www.delorie.com/djgpp/doc/coff. I've add a little diff. Rsym puts each symbol info in a file named symbol.coff. Ntoskrnl is build with the symbol section. While inspecting ntoskrnl.exe, I've found some other bugs. Ntoskrnl.exe contains always a .stab and a .stabstr section. There exist relocation entries for the .stab and .stabstr section. There exist relocation entries for addresses after the end of the image.
- Hartmut
From: Hartmut Birr
the coff format is very simple if no relocation and debug infos are needed. A good starting point is http://www.delorie.com/djgpp/doc/coff. I've add a little diff. Rsym puts each symbol info in a file named symbol.coff. Ntoskrnl is build with the symbol section.
Thanks, your code works great for ntoskrnl.exe. However, I'm having trouble getting it right in other executables. If I set scnhdr.s_flags to "CONTENTS, ALLOC, LOAD, DATA" (0xc0000040, like .rsrc) symbol.coff is properly linked into the final executable, but it will take up space in the loaded image. If I set scnhdr.s_flags to "CONTENTS, READONLY, DEBUGGING, NEVER_LOAD, EXCLUDE" (0x42000802, like .stab) then the final executable does not include the .rossym section.
While inspecting ntoskrnl.exe, I've found some other bugs. Ntoskrnl.exe contains always a .stab and a .stabstr section. There exist relocation entries for the .stab and .stabstr section. There exist relocation entries for addresses after the end of the image.
Yes, I noticed that too. There's another problem: at the moment, we do 2 separate links, one to get driver.nostrip.sys, and then another one to get driver.sys. We use the symbolic information from driver.nostrip.sys but execute driver.sys. Unfortunately, it seems the code in driver.sys and driver.nostrip.sys can differ in small but subtle ways. This seems to happen when static libraries are used, like in tcpip.sys. Since this makes the symbolic info not match the executable code, the symbolic info is useless.
So, I'm thinking about changing rsym to take driver.nostrip.sys as input and produce a driver.sys as output, by stripping the .stab and .stabstr sections, removing no longer needed relocation entries and removing symbolic information. While it's at it, it can add the .rossym section too. It should speed up the build a little bit too, since the second link is replace by some chopping.
Gé van Geldorp.
Ge van Geldorp schrieb:
However, I'm having trouble getting it right in other executables. If I set scnhdr.s_flags to "CONTENTS, ALLOC, LOAD, DATA" (0xc0000040, like .rsrc) symbol.coff is properly linked into the final executable, but it will take up space in the loaded image. If I set scnhdr.s_flags to "CONTENTS, READONLY, DEBUGGING, NEVER_LOAD, EXCLUDE" (0x42000802, like .stab) then the final executable does not include the .rossym section.
Hi,
have you modified the linker script files from gcc? You must add the section definition like the changes in ntoskrnl.lnk.
- Hartmut
From: Hartmut Birr
Sometimes ago there has exist a profiling function which has used the hash table functions from ex\hashtable.c. The problem is that the hash table functions uses the splay tree functions which are broken. In my local sources I had replaced the splay tree by a simple double linked list. I've never see such speed problems. This was at the begining of 2003. I had used the profiling for compiling ros on ros.
Yes, sounds like the same profiling stuff I'm using. I already talked to Casper (original author of the hashtable stuff) about the splay tree.
Note that my profiling experiment is sort of a worst-case scenario. I'm using a 300 MHz machine which is not exactly roaring with power. I have a foreground thread running the functions I want to profile, which is completely CPU bound. So basically the foreground and profile-conversion thread each get 50% of the CPU. Timing was done by calling KeQueryTickCount() at beginning and end of the conversion and printing the difference.
Compiling ros on ros might be a little different, as that might not be CPU bound. Throw in a faster CPU and the conversion might go a lot faster.
Gé van Geldorp.
(Original message here: http://reactos.com:8080/archives/public/ros-dev/2005-January/001457.html too long to quote)
It seems most people liked the idea of getting rid of the .sym files and incorporate the symbolic information into the PE files. So, that's what happens now.
rsym will now take a file.nostrip.ext, create the ReactOS symbolic info, strip unneeded sections (.stab and .stabstr), strip COFF symbols, fixup the relocation entries, add the ReactOS symbolic info as a .rossym section and write the output to file.ext. The .sym files are gone. Since this gets rid of the second link sequence we did before to obtain file.ext, the build is even a little bit faster now.
It's a bigger change than I expected when I started on it. I've done a lot of testing in the last few days, but it is entirely possible that I missed something. If so, let me know and I'll try to fix it asap.
You can delete any files in \reactos\symbols (and remove that directory). A "make clean" is in order after "svn up".
Gé van Geldorp.
Ge van Geldorp wrote:
(Original message here: http://reactos.com:8080/archives/public/ros-dev/2005-January/001457.html too long to quote)
It seems most people liked the idea of getting rid of the .sym files and incorporate the symbolic information into the PE files. So, that's what happens now.
Does addr2line still work, or can you provide one for us?
From: Royce Mitchell III
Does addr2line still work, or can you provide one for us?
Yes, addr2line still works, it uses file.nostrip.ext which is unchanged. The changes are in file.ext. It shouldn't be to hard to create an alternative tool but there doesn't seem too much point in that.
Gé van Geldorp.
Ge van Geldorp wrote:
From: Royce Mitchell III
Does addr2line still work, or can you provide one for us?
Yes, addr2line still works, it uses file.nostrip.ext which is unchanged. The changes are in file.ext. It shouldn't be to hard to create an alternative tool but there doesn't seem too much point in that.
Gé van Geldorp.
Sounds very nice! :-)
Ge van Geldorp wrote:
Yes, addr2line still works, it uses file.nostrip.ext which is unchanged. The changes are in file.ext. It shouldn't be to hard to create an alternative tool but there doesn't seem too much point in that.
It's my understanding that using addr2line against the nostrip file can be off sometimes, because of subtle differences between the two files. Also, there will be times where there are no nostrip files, and we may have to figure out what's going on from the files the user is using.
From: Royce Mitchell III
It's my understanding that using addr2line against the nostrip file can be off sometimes, because of subtle differences between the two files.
I think this problem is solved now too. It was caused by the two links we used to do, one to get file.nostrip.ext and then another one to get file.ext. Especially when static libs were involved the second link might produce a slightly different binary than the first.
The second link is gone now. The code and data is copied unchanged from file.nostrip.ext to file.ext.
Also, there will be times where there are no nostrip files, and we may have to figure out what's going on from the files the user is using.
If symbolic info was present in file.nostrip.ext (i.e. it was compiled with -g, which at present implies DBG := 1), then it will also be present in file.ext. So, the backtrace on e.g. BSODs should print the function name/source file/source line in addition to the address. Which means we won't have to use addr2line very often. If there was no symbolic info in file.nostrip.ext, addr2line wouldn't be of help anyway.
But as I said, creating a util like addr2line for the final file.ext is a piece of cake, it's just stringing together some calls to the rossym lib. I'll create something this week.
Which brings me to my last point: I'd like to enable -g (creation of symbolic info by the toolchain) for all builds, both DBG := 0 and DBG := 1. The only downside I see is it will take more diskspace to build ReactOS (currently 725 MB without symbolic info, 2477 MB with) and the final iso will be slightly larger (13 MB instead of 10MB). Personally I think this is outweighed by the possibility to have meaningful backtraces. Any objections?
Gé van Geldorp.
Which brings me to my last point: I'd like to enable -g (creation of symbolic info by the toolchain) for all builds, both DBG := 0 and DBG := 1. The only downside I see is it will take more diskspace to build ReactOS (currently 725 MB without symbolic info, 2477 MB with) and the final iso will be slightly larger (13 MB instead of 10MB). Personally I think this is outweighed by the possibility to have meaningful backtraces. Any objections?
Gé van Geldorp.
Please do. I was going to propose this eventually to have meaningful error reports for releases. It's a step in the right direction towards removing the DBG and KDBG configurations to minimize build errors.
Casper