Subject: Re: kernel debugging over DMA/Firewire (firescope and fireproxy) - msg#00073
List: linux.kernel.debugging.kgdb.bugs
Bernhard,
Ambitious project, I must say! Kindly keep us updated on the progress of this
project.
-Amit
On Sunday 12 Mar 2006 7:15 pm, Bernhard Kaindl wrote:
> Hi,
> this is an announcement of a development which uses the Linux Firewire
> (IEEE1394) drivers to allow remote debugging of Linux machines over
> Firewire, using gdb, which is why I'm mailing this here.
>
> The following is also on http://www.suse.de/~bk/firewire/ANNOUNCE.txt
> except for the update at the end.
>
> This little project is aimed at symbolic debugging of machines which
> do not have a serial port. Using Firewire (called i.Link by Sony),
> it could (when completed) be used like kgdboe (kgdb over ethernet
> using netpoll, no interrupts), except for one point:
>
> Target memory can be read (by gdb) and dumped even when the system is
> hung or crashed (DRAM refresh and DMA need to work though) without any
> debugger, memory dumper or kdump in operation in the crashed kernel or
> system. This works now.
>
> The origin of the project is Benjamin Herrenschmidt's tool firescope,
> which is a FireWire frontend for the xmon kernel monitoring/debugging
> tool for ppc:
>
> firescope controls a remote xmon over a FireWire cable, using FireWire's
> direct memory access (reading/writing to memory areas) for communication.
> The FreeBSD guys followed benh's example with gdb and system console
> over firewire in 2003.
>
> Physical DMA is part of the IEEE1394 specification and all OHCI-1394
> compatible controllers implement it. I do not have a FireWire card
> which is not OHCI-1394-compatible. I could not test the Texas Instruments
> PCILynx chip (have no such controller available), however.
>
> The BIOS of one of my laptops even supports booting from Firewire, and
> it allows reading at least 4k of memory even before Linux is booted.
>
> You might say that allowing physical reads/writes to arbitrary addresses
> is a big security hole, but when looking closer, it always depends on
> physical access to the machine, similar to using USB packet sniffing to
> capture your keystrokes. Of course you might want to disable physical
> DMA access by reloading ohci1394 with the module option phys_dma=0.
>
> Some links for further details:
> http://en.wikipedia.org/wiki/FireWire#Security_issues
> http://lists.freebsd.org/pipermail/freebsd-security/2004-November/002475.html
> http://www.derkeiler.com/Mailing-Lists/FreeBSD-Security/2004-11/0020.html
>
> <anecdote>
> On Mac OS X laptops, apparently the easiest solution was to put epoxy
> into the ports of the laptop (until it was possible to disable DMA):
> http://rentzsch.com/macosx/securingFirewire
> The default however has stayed as DMA=on, at least until this blog
> entry.</anecdote>
>
> Andi Kleen ported the xmon-independent parts of firescope to x86_64/i386
> and implemented dmesg buffer display without target cooperation.
> Using the System.map of the remote kernel, it shows you the dmesg buffer
> of the remote system by telling the remote FireWire controller to read
> the memory directly using DMA and send it to firescope over FireWire.
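
An illustrative sketch of the symbol lookup described above, not code taken from firescope. It parses System.map-style lines into a name-to-address table and converts a kernel virtual address to the physical address a DMA read would need. The `system_utsname` address is the one shown in the sample output further down; the `log_buf` address is invented for illustration, and the offset `0xffffffff80000000` is an assumption about how x86_64 kernels of that era map the kernel image.

```python
# Assumption: x86_64 kernel image mapping base; physical = virtual - base.
KERNEL_MAP_BASE = 0xFFFFFFFF80000000


def parse_system_map(text):
    """Return {symbol name: address} from '<hex addr> <type> <name>' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols[name] = int(addr, 16)
    return symbols


def kernel_virt_to_phys(virt):
    """Translate a kernel-image virtual address to a physical address."""
    if virt < KERNEL_MAP_BASE:
        raise ValueError("address is not inside the kernel image mapping")
    return virt - KERNEL_MAP_BASE


# The log_buf address below is hypothetical sample data.
SAMPLE_MAP = """\
ffffffff80323ca0 D system_utsname
ffffffff803f1a00 d log_buf
"""

symbols = parse_system_map(SAMPLE_MAP)
print(hex(kernel_virt_to_phys(symbols["system_utsname"])))
```

The physical address is what would be handed to the remote controller as the DMA read target; anything outside the kernel image mapping would need a different translation.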
>
> I have taken the firewire code from Andi's firescope port and added
> a GDB remote protocol backend to it (similar to the gdbstub in the kernel),
> and this is fireproxy:
>
> It proxies the gdb remote protocol to firewire (so far it allows reading
> and writing remote memory by gdb).
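
A minimal sketch of what proxying memory reads involves, assuming only the standard GDB remote protocol framing (`$payload#checksum`, where the checksum is the modulo-256 sum of the payload bytes) and the `m addr,length` read request. The `read_mem` callback is hypothetical and stands in for the FireWire DMA read; this is not fireproxy's actual code.

```python
def rsp_checksum(payload: bytes) -> int:
    """Modulo-256 sum of the payload bytes."""
    return sum(payload) % 256


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $payload#xx."""
    return b"$" + payload + b"#" + b"%02x" % rsp_checksum(payload)


def rsp_unframe(packet: bytes) -> bytes:
    """Validate framing and checksum, then return the payload."""
    if not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if int(packet[-2:], 16) != rsp_checksum(payload):
        raise ValueError("bad checksum")
    return payload


def handle_read(payload: bytes, read_mem) -> bytes:
    """Serve 'm addr,length' using read_mem(addr, length) -> bytes."""
    if not payload.startswith(b"m"):
        raise ValueError("not a memory read request")
    addr_hex, len_hex = payload[1:].split(b",")
    data = read_mem(int(addr_hex, 16), int(len_hex, 16))
    return data.hex().encode("ascii")  # reply is hex-encoded memory


# Stand-in for remote memory; a real proxy would issue a DMA read here.
fake_memory = bytes(range(16))


def fake_read(addr, length):
    return fake_memory[addr:addr + length]


request = rsp_frame(b"m4,3")
reply = rsp_frame(handle_read(rsp_unframe(request), fake_read))
print(request, reply)
```

Acknowledgements (`+`/`-`), escaping and the write request (`M`) are left out to keep the sketch short.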
>
> Traditional remote debugging:
>
>    debugger                         Program or system being debugged
>
>  gdb(+frontend)                     gdbserver    (Program debugging)
>                                     gdbstub/kgdb (Kernel debugging)
>  +-----------+                         +-----------+
>  |           |                         |           |
>  | Machine A |<- GDB Remote Protocol ->| Machine B |
>  |    gdb    |   (serial, Ethernet)    |  gdbstub  |
>  +-----------+                         +-----------+
>
> Fireproxy (when complete) provides the same functionality, but the machine
> being debugged is not machine B, but machine C:
>
>  gdb(+frontend)                          fireproxy         Final target
>  +-----------+                         +-----------+      +-----------+
>  |           |                         |           |      |           |
>  | Machine A |<- GDB Remote Protocol ->| Machine B |<---->| Machine C |
>  | Connects  | using TCP/IP network    | TCP Port  | IEEE | (gdbstub) |
>  +-----------+ or tunnel to            +-----------+ 1394 +-----------+
>
> Naturally, one can achieve similar results by using remote login
> from Machine A to Machine B and running the debugger (and its frontend)
> directly on Machine B.
>
> This only works when the ohci1394 driver has enabled DMA, so
> an early ohci1394 init would be needed for early boot debugging
> (or the hanging kernel would have to be started using kexec).
>
> Quickstart:
>
> 1) Copy vmlinux.debug from Machine C's kernel to Machine A and
>
> 2) Machine C: modprobe ohci1394 (if not loaded already)
>
> 3) On Machine B, after unpacking the tarball, run:
>
> # make && ./fireproxy System.map-from-MachineC (if you have)
>
> Sample output:
>
> Loaded system.map <../System.map> <879897> bytes
> 2 nodes available, local node is: 1
> 0: ffc0, uuid: 00080100 fa360220
> 1: ffc1, uuid: 00080100 cc8b0120 [LOCAL]
> pick a target node: not a ppc
> utsname addr: ffffffff80323ca0
> Attached to node 'f229'
> System : x86_64
> Version: 2.6.16-rc5-git2-3-default (#1 Tue Feb 28 09:16:17 UTC 2006)
> Target : ffc0
> Gen : 3
> Ready to accept on port 4
>
> * On Machine A:
>
> gdb /path/to/vmlinux/with/debuginfo
> ...
> (gdb) target remote Machine_B.somewhere.net:4
>
> getting this message is normal:
> 0x0000000000000000 in ?? ()
>
> Dumping the printk buffer to a file is possible (if it has
> not wrapped yet) with:
>
> (gdb) dump binary memory dmesg.out log_buf (log_buf+log_end)
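
The caveat about wrapping can be handled after the fact if the whole buffer is dumped rather than only the first `log_end` bytes. The sketch below is hedged: it assumes the log is a ring buffer whose length is a power of two and that `log_end` counts every byte ever written, which is how I understand printk of that era but is not stated in the announcement. The function name is hypothetical.

```python
def linearize_log(raw: bytes, log_end: int) -> bytes:
    """Return the log text in chronological order.

    raw     -- dump of the complete ring buffer (len(raw) is the buffer size)
    log_end -- total number of bytes ever written to the log (assumption)
    """
    buf_len = len(raw)
    if buf_len == 0 or buf_len & (buf_len - 1):
        raise ValueError("buffer length must be a non-zero power of two")
    if log_end <= buf_len:
        return raw[:log_end]           # not wrapped: plain prefix
    split = log_end & (buf_len - 1)    # position of the oldest byte
    return raw[split:] + raw[:split]   # oldest part first, newest last


# 12 bytes "ABCDEFGHIJKL" written into an 8-byte ring: "IJKL" overwrote "ABCD".
print(linearize_log(b"IJKLEFGH", 12))
```

Once wrapped, the oldest bytes are gone for good; the function only restores the order of what is still in the buffer.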
>
> You could also get a full memory dump this way and analyze
> it using lcrash (lkcdutils) or RH's crash tool. (Note: There is
> a memory leak somewhere, so this may have to be fixed first.)
>
> You can use the usual commands, like:
> (gdb) p system_utsname.release
>
> The included .gdbinit expects the vmlinux file at ../vmlinux and
> has a few useful macros which do symbolic stack backtraces of the
> tasks on the system. Here is an example:
>
> (kgdb) btpid 5507
> looking for task_struct of pid 5507...
> This may take a few seconds or up to a few minutes (with many tasks)
>
> pid 5507 - nautilus:
> -------------------
> 803203a0 init_task in section .data
> 8012fba5 __mod_timer + 169 in section .text
> 802bc8e2 schedule_timeout + 150 in section .text
> 803df758 per_cpu__tvec_bases + 280 in section .bss
> 803df958 per_cpu__tvec_bases + 792 in section .bss
> 8012f6c5 process_timeout in section .text
> 803df640 per_cpu__tvec_bases in section .bss
> 8017ad10 do_sys_poll + 629 in section .text
> 8017a257 __pollwait in section .text
> 8017ae0c sys_poll + 58 in section .text
> 8010a5e8 tracesys + 209 in section .text
>
> looking for task_struct of pid 1...
> This may take a few seconds or up to a few minutes (with many tasks)
> (kgdb)
>
> This could work better if I had already implemented thread support
> in fireproxy (or were able to use the thread support in the kernel
> gdbstub), which would show Linux processes as threads in GDB,
> but bear with me a little: I started coding only a few days ago.
>
> Of course, the goal is to control a remote gdbstub in the same
> way as if you had a serial connection, just at 400,000,000 bps :)
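
To put the quoted bus speed next to a serial line, here is a back-of-the-envelope calculation. It uses raw line rates only; the 115200 baud figure and the 10 bits per byte for 8N1 serial framing are my assumptions for comparison, and real FireWire throughput will be lower than the nominal 400,000,000 bps because of protocol overhead.

```python
def transfer_seconds(num_bytes, bits_per_second, bits_per_byte=8):
    """Idealized transfer time, ignoring all protocol overhead."""
    return num_bytes * bits_per_byte / bits_per_second


ONE_GIB = 1024 ** 3

firewire_s = transfer_seconds(ONE_GIB, 400_000_000)
serial_s = transfer_seconds(ONE_GIB, 115_200, bits_per_byte=10)  # 8N1 framing

print(f"1 GiB over FireWire: about {firewire_s:.0f} s")
print(f"1 GiB over serial:   about {serial_s / 3600:.0f} h")
```

The gap matters mostly for full memory dumps; for ordinary debugger traffic, latency per request is likely to matter more than bandwidth.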
>
> The next goal is to make fireproxy talk with a kernel-gdbstub,
> which would mean that you could get CPU backtraces, breakpoints,
> watchpoints and single-stepping.
>
> I guess that such a communication stub may even be possible on top
> of kdb, perhaps mostly as a loadable module, which would allow quick
> development of the stub. The next, easier goal would be to use the same
> module interface which kgdboe uses with kgdb, and to implement
> communication with fireproxy using memory areas.
>
> To implement the "Ctrl-C" function to enter the remote kernel debugger
> by a remote signal, the ohci1394 driver could subsequently be modified
> to pass control to the kernel debugging stub when it receives an
> interrupt caused by a unique packet.
>
> A detailed README which also covers all current issues and caveats
> (read it if you want to try it now) is at:
>
> http://www.suse.de/~bk/firewire/README.txt
>
> You can download the latest tarball using this directory listing:
> http://www.suse.de/~bk/firewire/
>
> There is quite some room for improvement (possible improvements are
> documented in the README), but since it can already be used to debug
> some real problems, I wanted to make the tool known.
>
> If you have ideas or patches for improvement, they will of course
> be appreciated.
>
> If other developers are interested in joining development, a project
> on some open source software development management system could be
> opened to have a common repository. Please send me a mail if you
> would like to get involved.
>
> PS: FireWire is a trademark of Apple, Sony uses the name i.Link for the
> same bus, and the generic name is IEEE 1394, or just 1394
> (thirteen-ninety-four) when talking about it. FireWire is just the most
> popular name. Much of the technical information uses IEEE 1394 to refer
> to the standard.
>
> PPS: Good information is in: http://en.wikipedia.org/wiki/FireWire
> ----------------------------------------------------------------------
>
> (The following is NOT YET on http://www.suse.de/~bk/firewire/ANNOUNCE.txt):
>
> Update:
>
> I have a simple protocol, designed to be fast, for talking with
> kgdb (sort of) running; it's based on kgdboe (kgdb over Ethernet) and
> is still in an early debugging state. I'm planning to name it kgdbom
> (kgdb on memory) since it is a generic communication module for kgdb
> communication using memory areas.
> There may be other interfaces which could allow DMA too, and it would
> be usable with any such interface, because it only talks through memory.
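
The announcement does not describe the kgdbom wire format, so the following is only a generic illustration of how two sides can talk "through memory": a polled mailbox with a sequence counter, simulated on a local `bytearray`. The slot layout, the `Mailbox` class and all names are hypothetical and are not taken from kgdbom.c.

```python
import struct

# Hypothetical slot layout: sequence number, payload length, then payload.
HEADER = struct.Struct("<II")
SLOT_SIZE = 256


class Mailbox:
    """One direction of a polled mailbox inside a shared memory area."""

    def __init__(self, memory: bytearray, offset: int):
        self.memory = memory
        self.offset = offset
        self.seen = 0  # last sequence number this reader has consumed

    def post(self, payload: bytes) -> None:
        if len(payload) > SLOT_SIZE - HEADER.size:
            raise ValueError("payload too large for slot")
        seq, _ = HEADER.unpack_from(self.memory, self.offset)
        start = self.offset + HEADER.size
        self.memory[start:start + len(payload)] = payload  # data first
        HEADER.pack_into(self.memory, self.offset, seq + 1, len(payload))

    def poll(self):
        """Return a new payload, or None if nothing was posted since last poll."""
        seq, length = HEADER.unpack_from(self.memory, self.offset)
        if seq == self.seen:
            return None
        self.seen = seq
        start = self.offset + HEADER.size
        return bytes(self.memory[start:start + length])


# Simulated shared area; over FireWire the proxy side would use DMA accesses.
shared = bytearray(2 * SLOT_SIZE)
proxy_side = Mailbox(shared, 0)   # writer for proxy -> target
target_side = Mailbox(shared, 0)  # reader for proxy -> target

proxy_side.post(b"$D#44")         # gdb remote protocol "detach" packet
print(target_side.poll(), target_side.poll())
```

Writing the payload before bumping the sequence number is the usual way to keep a poller from seeing a half-written message; whether remote DMA writes give the ordering guarantees this needs is something a real implementation would have to check.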
>
> I'm able to send simple commands like detach and kill (remove all
> breakpoints and exit kgdb) from fireproxy to the kernel gdbstub,
> and they are executed there. The next, final steps would be to try
> to forward gdb's commands to the kernel and to forward the answers
> from the kernel gdbstub back to gdb. Then the communication is
> running.
>
> NMI:
>
> To stop a running remote machine over Firewire, rather than by Sysrq-G
> or a pre-set breakpoint or watchpoint, one would have to trigger an
> interrupt. Normal interrupts, however, are not delivered while the
> kernel is looping with interrupts disabled; on PCs, only the NMI
> (non-maskable interrupt) is still handled in this state. Such an
> interrupt would be needed in situations where all other interrupts are
> disabled, e.g. when the kernel has crashed or hung in such a state. Such
> an NMI could be used to wake up the machine and let it jump into gdb.
>
> I was in fact able to reproducibly trigger an NMI by
> reading some memory area on a Samsung X20 laptop.
>
> I suspect that I may have hit some I/O or memory area of some
> PCI device or the Intel ICH6 chipset by these reads and that
> this device or the chipset itself triggered the NMI.
>
> I assume this because Maximillian Dornseif, a security lecturer,
> mentioned in a talk on Firewire that accessing memory ranges
> which are provided by PCI devices is apparently not always liked
> by these devices and may cause crashes:
> http://md.hudora.de/presentations/##forensics2004-firewire
>
> Only the first such read was able to trigger the NMI;
> subsequent tries (without rebooting) did not yield an NMI.
> However, if it is possible to recreate that at will, even if
> only once, it would allow debugging of crashes which would
> otherwise require manipulation of hardware.
>
> If we manage to find out how to generically trigger such an NMI
> by DMA accesses from Firewire, it would even allow us to
> implement the "Ctrl-C" functionality: dropping a running
> machine into kgdb by pressing Ctrl-C in gdb.
>
> Some variants of BIOS firmware from Phoenix have support for booting
> from Firewire (on laptops such as the X20), so at least after
> enabling it, I was able to "see" the initialized controller
> even before Linux is booted, and a read access to low memory
> triggered a PCI interrupt which was shown by the grub bootloader.
> This interrupt may however have been programmed by the BIOS for
> the boot from Firewire.
>
> When the protocol is running on both sides (kernel
> and fireproxy) and doing useful work, I'll send an
> update here.
>
> For the curious, I have attached a snapshot of the new kgdb I/O module,
> which has some debug printks and will certainly undergo some
> further changes.
>
> If you are interested further, please contact me.
>
> - Bernhard
-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642