Debugging

From ReactOS Wiki
Revision as of 09:27, 1 March 2018 by ThFabba (talk | contribs) (Dynamic: Add "Sampling the Stack" section)

This page describes the available methods for debugging ReactOS and the steps each one requires.

Introduction

To help ReactOS development, whether by working on the source code or by taking part in testing, you need to know how to generate useful debug logs.

Useful debug logs give developers the information they need to pinpoint exactly what the operating system is doing. Many people know how to get the default debug output from the operating system, but on its own this is rarely enough to locate a bug.

This article aims to give users knowledge not only on how to generate a debug log, but on how to generate a useful debug log which can be used directly to assess what the operating system is doing.

Available debugging methods

There are various methods to debug ReactOS; some require more knowledge than others. These are listed below.

Debugging through text messages

This is the easiest method for receiving debug information from ReactOS.

Serial Port

The serial port is the most common method used for receiving debug messages from ReactOS. The method used for receiving data from the serial port depends on whether you run ReactOS in a virtual machine or on a real computer. If you plan to use a virtual machine, you might want to consider using Com0com instead of a named pipe for connecting to the virtual serial port. You can also test the serial port with RS232 test software.

Virtual machines

Instructions for handling serial output from virtual machines can be found on the VM-specific debugging pages.

Real computer: Physical serial cable

You will need a physical serial cable if you want to receive debug messages from a real computer through the serial port. This method also requires two computers (one on which you test ReactOS and another one for receiving the debug messages). The ReactOS test computer must have a serial port. Note that a USB-serial adapter is unsuitable, but a PCI serial card should work.

The cable needed for this debugging method is a Null-Modem serial cable. You should find it in many computer shops for less than 10 dollars. If you don't have one ready, you can also build one:

DTE1_______________________________________________DTE 2

9pol 25pol (female)__________________________25pol 9pol (female)
5    7  ---GND---------------------GND-------  7   5

2    3  ---RxD--------. ,----------RxD-------  3   2
                       X
3    2  ---TxD--------' `----------TxD-------  2   3

7    4  ---RTS--------. ,----------RTS-------  4   7
                       X
8    5  ---CTS--------' `----------CTS-------  5   8

4   20  ---DTR--------. ,----------DTR------- 20   4
                       X
6    6  ---DSR--o-----' `-------o--DSR-------  6   6
                |               |
1    8  ---DCD--'               `--DCD-------  8   1

Connect the cable to the first serial port of both computers.

Then use a terminal application like PuTTY or Windows HyperTerminal on the computer that receives the debug messages. Set it up to listen to the first serial port (COM1 [3F8/IRQ4]) at a baud rate of 115200.

After that, boot ReactOS (Debug) on the test computer and you should receive debug messages. If this doesn't work, check your hardware and your freeldr.ini configuration.

Serial hardware can be tricky to get right, but be persistent. There are a few things to remember:

  • Plan which connections are DTE and which DCE, and which gender each has. Know which serial port (1 or 2) you're connecting on each computer.
  • If you use a PCI serial card, it may be necessary to pass the serial port address to the kernel (for details see below).
  • Get the right kind of null modem. There are a few ways to make them and not all are the same.
  • Keep cables as short as possible.
  • Use a serial terminal program such as HyperTerminal or Minicom to observe the remote computer. If you don't see data you can recognize, then something is wrong.
  • GDB remote packets start with $ and end with # followed by a two-digit hexadecimal checksum. You'll be able to recognize them that way.
  • Note: use these settings (In HyperTerminal)
    • Bits per second: 115200
    • Data bits: 8
    • Parity: None
    • Stop bits: 1
    • Flow control: Hardware
  • Note: Some older BIOSes may have problems with a baud rate of 115200. In that case, try 9600 on the test computer by editing freeldr.ini and adding the parameter /BAUDRATE=9600. You also have to change the baud rate to 9600 on the receiving computer.
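To help recognize GDB traffic in a raw serial capture: the GDB remote serial protocol frames each packet as $<data>#<checksum>, where the checksum is the sum of the data bytes modulo 256, written as two hexadecimal digits. The following Python sketch checks whether a captured string is a well-formed packet. The function names are hypothetical, and the sketch ignores the protocol's escaping and run-length encoding rules.

```python
def gdb_checksum(payload):
    """Return the two-digit hex checksum for a GDB remote packet payload."""
    return format(sum(payload.encode("ascii")) % 256, "02x")


def frame_packet(payload):
    """Wrap a payload as $<payload>#<checksum>."""
    return "${}#{}".format(payload, gdb_checksum(payload))


def is_valid_packet(text):
    """Check that text is a framed packet whose checksum matches its payload."""
    if len(text) < 4 or not text.startswith("$") or text[-3] != "#":
        return False
    payload, checksum = text[1:-3], text[-2:]
    return checksum.lower() == gdb_checksum(payload)


if __name__ == "__main__":
    print(frame_packet("g"))         # $g#67
    print(is_valid_packet("$?#3f"))  # True
```

If the terminal shows strings of this shape, the link is carrying GDB packets rather than plain debug text.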

How to start Serial terminal on Linux

  • First, install the cu terminal program. On rpm-based systems, cu is in the uucp-V.V.V.rpm package.
    • on yum-based system run: sudo yum -y install uucp
    • on apt-based system run: sudo apt-get install uucp
  • Some Linux distributions require the following change in the /etc/group file: add your user (or possibly root) to the dialout group.
  • Then run:
sudo cu -s 115200 --parity=none -l /dev/ttyS0

Here:

  • /dev/ttyS0 is your COM port device name; you can find it in the output of the dmesg|grep tty command.
  • the "-e -o" options can be used instead of --parity=none.

Troubleshooting

If you get a "Line in use" or "Permission denied" error:

user1~ # cu -s 115200 --parity=none -l /dev/ttyS4
cu: open (/dev/ttyS4): Permission denied
cu: /dev/ttyS4: Line in use
user1~ #

make sure you have added your user (user1 in this example) to the dialout group.

Serial terminal through FreeBSD

cu comes preinstalled on FreeBSD.

Run in console:

sudo cu -s 115200 -e -o -t -l /dev/cuau0

Here /dev/cuau0 is your serial port (COM) device name; find the right name for your COM port in the output of the dmesg command. The other options are:

  • "-e -o" together mean no parity
  • -l /dev/Xdev specifies the COM device name
  • -t denotes that the connection is hardwired to a host on a dial-up line (it is unclear whether this option is really needed).

To collect the log output into a file that you can send to others, run this script:

DATE=`date +"%F_%H%M%S"`
screen -dmS ROSlogger script MyMachine1-ROS-debug-$DATE.log sudo cu -s 115200 -e -o -t -l /dev/cuau0

It writes the log into a file named MyMachine1-ROS-debug-$DATE.log, where $DATE is the time at which the script was started. You can change MyMachine1 to your machine's name.

Debug text output to file

Choose ReactOS (Log file) in the boot menu. The debug messages will go to a file called debug.log. This method has some limitations. Fatal system error messages will not appear in the log file. To redirect the output into another file, edit the kernel parameter /DEBUGPORT=FILE in freeldr.ini. For example:

Options=/DEBUG /DEBUGPORT=FILE:\Device\Harddisk0\Partition1\debug.log /SOS

Or:

Options=/DEBUG /DEBUGPORT=FILE:\ArcName\multi(0)disk(0)fdisk(0)\debug.log /SOS

Debug text output to screen

Choose ReactOS (Screen) in the boot menu.

Or edit freeldr.ini to contain an entry like the following:

[ReactOS_Debug]
BootType=Windows2003
SystemPath=multi(0)disk(0)rdisk(0)partition(1)\ReactOS
Options=/DEBUG /DEBUGPORT=SCREEN /SOS


Advanced option: Debugging the debug logger

Sometimes things go wrong and it becomes necessary to debug the debug logger itself, for example the SCREEN logger. To do this, you can turn on more than one logger by specifying each of them in the kernel command line options. Edit freeldr.ini to contain an entry like the following:

[ReactOS_Debug2]
BootType=Windows2003
SystemPath=multi(0)disk(0)rdisk(0)partition(1)\ReactOS
Options=/DEBUG /DEBUGPORT=SCREEN /DEBUGPORT=COM1 /SOS

Changing the BAUD rate

If you think that 115200 is too slow and your serial connection supports higher speeds, as virtual COM ports do, you can change it. Note that some BIOSes on older test computers may have a problem with 115200. In that case, use 9600 instead.

1. Open freeldr.ini in the root folder of the ReactOS installation.

2. Locate the "[ReactOS_Debug]" section.

3. Change the setting to something like "/BAUDRATE=921600" (tested to work with HyperTerminal and PuTTY).

4. Save the file.

5. Change your terminal's baud rate to match.

Changing the serial port address

Edit the kernel parameter /DEBUGPORT=COM. This may be necessary if you use a PCI/PCIe/PCMCIA/ExpressCard serial card on real hardware, which is the usual approach for notebooks without a built-in serial port. For example:

Options=/DEBUG /DEBUGPORT=COM:0xCC00 /BAUDRATE=115200 /SOS

See also: the Chromium OS Serial Debugging HOWTO for how to determine the I/O address of an inserted extension card.

Note: ReactOS does not (yet!) support MMIO-based (modern) serial extension cards.
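When checking a freeldr.ini configuration, it can help to split the Options value into its individual switches. The following Python sketch does this for the example lines on this page. The function name parse_boot_options is hypothetical, and the parsing is only an illustration; it is not how FreeLoader or the kernel actually process the options.

```python
def parse_boot_options(options):
    """Split an Options value into {SWITCH: [values]}.

    A switch without a value (such as /DEBUG) maps to an empty list.
    A repeated switch (such as two /DEBUGPORT entries) collects all values.
    """
    parsed = {}
    for token in options.split():
        if not token.startswith("/"):
            continue  # ignore anything that is not a /SWITCH
        name, has_value, value = token[1:].partition("=")
        values = parsed.setdefault(name.upper(), [])
        if has_value:
            values.append(value)
    return parsed


if __name__ == "__main__":
    opts = parse_boot_options("/DEBUG /DEBUGPORT=COM:0xCC00 /BAUDRATE=115200 /SOS")
    kind, _, address = opts["DEBUGPORT"][0].partition(":")
    print(kind, hex(int(address, 16)), opts["BAUDRATE"][0])  # COM 0xcc00 115200
```

A quick check like this makes it easy to confirm that the baud rate and port in the boot entry match the settings of the receiving terminal.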

KDBG

See kdbg command reference for more information about the built-in kernel debugger.

GDB

To use GDB as a kernel debugger, see GDB.

Needed Items:

Start QEMU as you normally would, but add the following command line parameters:

-s -S

This makes QEMU start in the STOPPED state, allowing you to connect using GDB. Now it's time to get GDB off the ground.

  • (Assuming you are in the RosBE command line), enter “gdb” to start GDB.
  • Enter “file ./output-i386/ntoskrnl/ntoskrnl.exe” to tell GDB where to load information about the kernel.
  • Enter “set disassembly-flavor intel” if you prefer Intel syntax.
  • Enter “target remote localhost:1234” to connect GDB to QEMU.
  • Enter “c” (for “continue”) to have GDB instruct QEMU to start/continue execution of the emulation.
  • To manually pause execution, make sure your GDB window has focus and press <CTRL>+<C>.

WinDbg

Main article: WinDBG

To take full advantage of WinDBG, you need to compile ReactOS with MSVC to get PDB symbols. For MSVC builds this is the default debugging style. If you want to use GCC builds, you need to compile with the WINKD option set to TRUE; you can either use CMake-GUI to edit the value after configuring and then reconfigure, or edit the default value in the options.cmake file. Another possibility is to replace ntoskrnl.exe and kdcom.dll with versions built with the WINKD = TRUE option. You can also replace kdcom.dll with the one from Windows 2003, which has a few more features, such as reconnect and break-in, that currently don't work properly with ReactOS' own kdcom.

Generating even more output

In order to get meaningful debug output it is sometimes necessary to enable extra verbosity.

Turning on verbosity at compile time

ReactOS Style

Nearly all ReactOS modules use the built-in "ReactOS style" debugging functionality. This style is characterized by:

  • Verbosity level is usually defined per file.
  • Only 2 message levels:
    • always enabled (DPRINT1)
    • only enabled when NDEBUG is not defined (DPRINT)

Files that follow this style can easily be spotted by this code:

 #define NDEBUG
 #include <debug.h>

To enable full verbosity just comment out the "#define NDEBUG", and remember to uncomment it when submitting patches.

Adding own debug messages

Make sure that you have included debug.h:

 #include <debug.h>

Then use DPRINT or DPRINT1. Both work like printf, although some format codes differ.

WINE Style

A sample line to enable debug channels in user mode applications, in the file:

boot/bootdata/hivesys.inf

; Debug channels
HKLM,"SYSTEM\CurrentControlSet\Control\Session Manager\Environment","DEBUGCHANNEL",0x00020000,"+ole,+rpc"

Turning on verbosity at runtime

The easiest way to turn on debug verbosity for a particular component is to use the DEBUGCHANNEL environment variable. For example, to get all debug messages from MSI, run the following in CMD:

set DEBUGCHANNEL=+msi

Then, run the app you wish to test. You can add debug messages from multiple components, for example:

set DEBUGCHANNEL=+msi,+rpc,+ole

You can find a complete list of the debuggable components here.

After you close CMD, debug verbosity reverts to the default.

<Describe details on how to set verbosity level, turn on debug from all components.>

Breaking into the built-in kernel debugger

A bugcheck occurs when the operating system can no longer operate safely and halts to avoid corrupting data. This normally produces a Blue Screen of Death, but if the kernel debugger is active, you are dropped into its prompt, where you can explore the system state. By default, ReactOS debug builds have the integrated kernel debugger (kdbg) enabled. Release builds do NOT have this feature enabled and will only display a blue screen.

There are two ways of forcing a bugcheck, each one employing a different method:

Dynamic

If you have a debug build and want to halt the system for any reason and break into the kernel debugger, you can force a bugcheck from the keyboard by pressing TAB+K.


Remember that kdbg output goes out through the serial port, but it receives input from the keyboard by default.

To allow input through the serial port as well, start ReactOS with the boot option "ReactOS (RosDbg)" or add the /KDSERIAL option to your freeldr.ini boot options.

Breaking on user mode Exceptions

For each type of exception known by KDB you can set the condition when KDB should be entered individually for first and last chance. The possible settings for the conditions are never, umode, kmode and always.

  • never: kdbg won't be entered when exceptions are raised
  • umode: kdbg will be entered when the exception was raised in usermode
  • kmode: kdbg will be entered when the exception was raised in kernel mode
  • always: kdbg will be entered on every exception

To change the condition to enter KDB on all exceptions to "always" (default is "kmode"), enter the debugger and type:

set condition * first always

Type "cont" to continue normal execution.

Sampling the Stack

If the system shows very high CPU usage or appears frozen, it may be possible to find the culprit by sampling the processor's call stack.

To obtain a simple sampling profile using the kernel debugger,

  • break into the debugger using TAB+K during a time of high CPU usage,
  • issue the bt command to print a stack backtrace (see #Generating a backtrace),
  • issue the cont command to continue execution,
  • repeat this process until you've obtained 3-5 backtraces, or the period of high CPU usage stops.

The backtrace samples obtained this way often show where processor time is being spent, and can pinpoint functions that have bugs or require optimization.
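Comparing several samples by eye works, but a small script can tally them. The following Python sketch counts in how many samples each function appears, assuming the translated frame format shown in the bt output on this page (<module:offset (file:line (Function))>). The names FRAME_RE and count_functions are hypothetical.

```python
import re
from collections import Counter

# Matches translated frames such as
# <ntoskrnl.exe:3ab23 (ntoskrnl/io/iomgr/irp.c:1088 (IofCallDriver))>
FRAME_RE = re.compile(
    r"<(?P<module>[^:<>]+):(?P<offset>[0-9a-fA-F]+) "
    r"\((?P<source>.+?):(?P<line>\d+) \((?P<function>[^()]+)\)\)>"
)


def count_functions(samples):
    """Count the number of backtrace samples in which each function appears."""
    counts = Counter()
    for sample in samples:
        # Use a set so that a recursive function counts once per sample.
        counts.update({m.group("function") for m in FRAME_RE.finditer(sample)})
    return counts


if __name__ == "__main__":
    samples = [
        "<vfatfs.sys:97de (drivers/filesystems/vfat/misc.c:111 (VfatDispatchRequest))>\n"
        "<ntoskrnl.exe:3ab23 (ntoskrnl/io/iomgr/irp.c:1088 (IofCallDriver))>",
        "<ntoskrnl.exe:3ab23 (ntoskrnl/io/iomgr/irp.c:1088 (IofCallDriver))>\n"
        "<ntoskrnl.exe:244c6 (ntoskrnl/ex/work.c:162 (ExpWorkerThreadEntryPoint))>",
    ]
    for function, hits in count_functions(samples).most_common():
        print(hits, function)
```

Functions that appear in most or all samples are the first candidates to inspect.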

Please note that backtraces are of limited use if they have not been translated. When showing them to other people, be sure to translate the addresses, or include the exact binary files you used (or a link to the iso) to allow others to perform the translation.

Static

This is useful when you want to halt the operating system as soon as it reaches a particular area of code you are debugging. You can then get an immediate backtrace to see the code path that led to the forced bugcheck.

This is done by using KeBugCheck() or ASSERT() in the code.

Generating a backtrace

In order to generate a backtrace you must break into the KDBG prompt.

Enter 'bt' and hit RETURN. You should see something similar to the following:

 (drivers\filesystems\vfat\rw.c:809) <\ReactOS\system32\kernel32.dll>
 Entered debugger on embedded INT3 at 0x0008:0x800935f2.
 kdb:> bt
 Eip:
 <ntoskrnl.exe:935f3 (lib\rtl\i386\debug_asm.S:31 (DbgBreakPoint@0))>
 Frames:
 <vfatfs.sys:97de (drivers/filesystems/vfat/misc.c:111 (VfatDispatchRequest))>
 <vfatfs.sys:9b25 (drivers/filesystems/vfat/misc.c:167 (VfatBuildRequest))>
 <ntoskrnl.exe:3ab23 (ntoskrnl/io/iomgr/irp.c:1088 (IofCallDriver))>
 <ntoskrnl.exe:36206 (ntoskrnl/io/iomgr/iofunc.c:686 (IoSynchronousPageWrite))>
 <ntoskrnl.exe:59daa (ntoskrnl/mm/section.c:6330 (MmspWriteDataSectionPages))>
 <ntoskrnl.exe:244c6 (ntoskrnl/ex/work.c:162 (ExpWorkerThreadEntryPoint))>
 <ntoskrnl.exe:70e90 (ntoskrnl/ps/thread.c:134 (PspSystemThreadStartup))>
 <ntoskrnl.exe:7b142 (ntoskrnl\ke\i386\ctxswitch.S:258 (KiThreadStartup@156))>
 kdb:> 

The bt command shows a backtrace of the currently attached thread, so it may be necessary to use the 'thread attach' command first. Please refer to the kdbg manual for more details.

Examining this in more detail, we can see that the bugcheck occurred via the INT3 operation, which then dropped us into kdb. The next line shows Eip, the instruction pointer, which points to the last address before the system halted.

Following on from that are the frames. This is the important part of the backtrace: it contains all the function addresses in the buildup to the bugcheck. This is the crucial information developers need to understand the code flow before the bugcheck.

Translating Addresses

Occasionally, there will come a time when you will need to manually translate addresses. When kdbg is not enabled and a bugcheck occurs, you will be presented with a stack trace similar to the following:

(subsystems\win32\csrss\win32csr\conio.c:1101) Console_Api Ctrl-C 
*** Fatal System Error: 0x00000001 (0x80079279,0x00000000,0x0000FFFF,0x00000000)
<\SystemRoot\System32\NTOSKRNL.EXE: 29bb> <\SystemRoot\System32\HAL.DLL: 4749> <\SystemRoot\System32\NTOSKRNL.EXE: 54cb4> <\SystemRoot\System32\NTOSKRNL.EXE: 582bf> <\SystemRoot\System32\NTOSKRNL.EXE: 583fd> <\SystemRoot\System32\NTOSKRNL.EXE: 89956> <\SystemRoot\system32\drivers\videoprt.sys: 2417> <\SystemRoot\system32\drivers\vbemp.sys: 17f5> <\SystemRoot\system32\drivers\vbemp.sys: 19cf> <\SystemRoot\system32\drivers\videoprt.sys: 1c48> <\SystemRoot\System32\NTOSKRNL.EXE: 34c17> <\SystemRoot\System32\NTOSKRNL.EXE: 21e0> <\SystemRoot\System32\NTOSKRNL.EXE: 2908> <\SystemRoot\System32\NTOSKRNL.EXE: 29bb> <\SystemRoot\System32\NTOSKRNL.EXE: 85fa8>

As you can see, this is largely the same as the information presented when issuing a 'bt' command in kdbg. The problem here, however, is that only the addresses are given. As these addresses differ between builds, this information is useless for anyone trying to follow the events that led up to the bugcheck.

The solution is to translate the addresses into human-readable function names. This is done with a tool named 'raddr2line', a modified version of the Unix tool 'addr2line'. It uses the debug information in the executable files to translate each address given to it into a file name and line number, and prints the result to the console. This information can be pasted into the above debug log alongside the addresses, providing the developers with a detailed stack trace.

raddr2line is included in the ReactOS Build Environment. It is invoked in the following way:

raddr2line <file> <address>

So taking the bottom address in the above stack trace:

C:\Users\Ged\MyFiles\ReactOS\clean_source>raddr2line ntoskrnl.exe 85fa8

C:\Users\Ged\MyFiles\ReactOS\clean_source\output-i386\ntoskrnl\ntoskrnl.exe
obj-i386\ntoskrnl\ex\zw.S:253 (ZwClearEvent)

We can see here that the address translation for 0x85fa8 is line 253 in file ntoskrnl\ex\zw.S (this will differ if you try it on your build).

This information can now be added into the above stack trace as follows:

<\SystemRoot\System32\NTOSKRNL.EXE: 29bb>    <enter next one here>
<\SystemRoot\System32\NTOSKRNL.EXE: 85fa8>   obj-i386\ntoskrnl\ex\zw.S:253 (ZwClearEvent)
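Translating a long trace by hand is tedious. The following Python sketch turns each unique entry of an untranslated stack trace into a raddr2line command line, assuming the <path: address> entry format shown above and the raddr2line <file> <address> invocation. The function name raddr2line_commands is hypothetical, and the sketch only prints the commands; it does not run raddr2line.

```python
import re

# Matches entries such as <\SystemRoot\System32\NTOSKRNL.EXE: 29bb>
ENTRY_RE = re.compile(r"<(?P<path>[^<>:]+):\s*(?P<address>[0-9a-fA-F]+)>")


def raddr2line_commands(trace):
    """Return one raddr2line command per unique (module, address) pair."""
    seen = set()
    commands = []
    for match in ENTRY_RE.finditer(trace):
        # Keep only the file name, e.g. NTOSKRNL.EXE -> ntoskrnl.exe
        module = match.group("path").rsplit("\\", 1)[-1].lower()
        address = match.group("address").lower()
        if (module, address) not in seen:
            seen.add((module, address))
            commands.append("raddr2line {} {}".format(module, address))
    return commands


if __name__ == "__main__":
    trace = (r"<\SystemRoot\System32\NTOSKRNL.EXE: 29bb> "
             r"<\SystemRoot\System32\HAL.DLL: 4749> "
             r"<\SystemRoot\System32\NTOSKRNL.EXE: 85fa8>")
    for command in raddr2line_commands(trace):
        print(command)
```

The generated lines can be pasted into the RosBE command line one at a time, using the same build that produced the trace.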

Enabling Kernel Tracing

Refer to this article for instructions on how to enable kernel tracing.

Debug Page Heap (DPH)

Based on the functionality of Windows Page Heap Verification, this mechanism is useful for more in-depth debugging of user mode Heap issues (crashes in ntdll:heap functions). It can be enabled per application or globally (for the whole system).

To enable DPH for a particular application, gflags.exe is needed. This program is part of the Debugging Tools for Windows and Windows Support Tools packages. Be sure to download the 32-bit package. GFlags.exe must be executed from within ReactOS with the following syntax: "gflags /p /enable application.exe /full". Run the application afterwards. DPH can be disabled by rerunning gflags with the /disable switch or by rebooting.

To enable DPH system-wide, you need to apply the following patch to the ReactOS source:

Index: lib/rtl/heap.c
===================================================================
--- lib/rtl/heap.c	(revision 64030)
+++ lib/rtl/heap.c	(working copy)
@@ -1234,6 +1234,8 @@
     NTSTATUS Status;
     ULONG MaxBlockSize;
 
+    RtlpPageHeapEnabled = TRUE;
+
     /* Check for a special heap */
     if (RtlpPageHeapEnabled && !Addr && !Lock)
     {
@@ -1254,6 +1256,8 @@
         Flags &= HEAP_CREATE_VALID_MASK;
     }
 
+    if (!Addr) Flags |= HEAP_FLAG_PAGE_ALLOCS;
+
     /* TODO: Capture parameters, once we decide to use SEH */
     if (!Parameters) Parameters = &SafeParams;

Special Pool

Special Pool is the debug version of the kernel pool. It helps detect overruns (optionally underruns) as well as use-after-free bugs.

The following constraints apply to the use of Special Pool:

  • Only allocations of 4088 bytes and smaller can use special pool.
  • Many Memory Manager structures rely on residing in specific memory regions, so they cannot be placed in Special Pool. Unless specifically trying to solve an Mm issue it is best to avoid placing any Mm allocations in Special Pool.
  • There is only a limited number of pages in the virtual address space reserved for Special Pool, and every allocation occupies two pages. Hence you may run out of pages quickly if you direct too many allocations to use special pool. This will be indicated by a "Special pool: No PTEs left!" debug print, and the allocations will fall back to regular pool. When enabling special pool on all allocations, this already happens at boot.

To enable Special Pool:

  • To use Special Pool for a single pool tag, set a value for MmSpecialPoolTag in ntoskrnl/mm/ARM3/pool.c. All allocations using this tag will be placed in Special Pool as long as PTEs are available.
  • To enable Special Pool for all allocations, set MmSpecialPoolTag to '*'.
  • To enable it for more than one tag, set MmSpecialPoolTag to anything except 0 or -1 and modify MmUseSpecialPool in ntoskrnl/mm/ARM3/special.c (e.g. to return TRUE for multiple tags).

Tips & Tricks:

  • You can change the value of MmSpecialPoolTag at any time; the only constraint is that it must not be 0 or -1 at boot, or Special Pool will never be initialized. Example WinDbg command: ed nt!MmSpecialPoolTag 'DCBA'

How to read/debug BugCheck messages

  • BugCheckCode parameter hex values are defined in:
\include\reactos\mc\bugcodes.mc
 or, generated .h-version for i386: \obj-i386\include\reactos\bugcodes.h
  • Some messages have useful instructions
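To look up a code in the files above, it helps to first pull the BugCheckCode and its four parameters out of the log. The following Python sketch does this for the "*** Fatal System Error" line format shown earlier on this page. The names BUGCHECK_RE and parse_bugcheck are hypothetical, and the sketch does not read bugcodes.mc itself.

```python
import re

# Matches lines such as
# *** Fatal System Error: 0x00000001 (0x80079279,0x00000000,0x0000FFFF,0x00000000)
BUGCHECK_RE = re.compile(
    r"\*\*\* Fatal System Error: (0x[0-9A-Fa-f]+) "
    r"\((0x[0-9A-Fa-f]+),(0x[0-9A-Fa-f]+),(0x[0-9A-Fa-f]+),(0x[0-9A-Fa-f]+)\)"
)


def parse_bugcheck(line):
    """Return (code, [param1..param4]) as integers, or None if no match."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        return None
    code, *params = (int(group, 16) for group in match.groups())
    return code, params


if __name__ == "__main__":
    line = ("*** Fatal System Error: 0x00000001 "
            "(0x80079279,0x00000000,0x0000FFFF,0x00000000)")
    code, params = parse_bugcheck(line)
    # Print the code in the zero-padded form to search for in the bug code files.
    print(format(code, "#010x"), [hex(p) for p in params])
```

The zero-padded code printed by the sketch is the value to search for in the bug code definitions.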