Wednesday, 25 November 2009

PXE Odyssey

Booting over the network is a nice thing when it works. My laptop didn't recognize my CD-RW, so I tried to boot it with the help of another Windows machine and then install a Linux distribution from the Internet. At first, the best solution wasn't obvious. You need a few server applications (DHCP and TFTP, for a start), which are all nicely integrated into a program called tftpd32. The other prerequisites are the files (bootloader, installer...) that are actually loaded onto the target machine. Just follow the instructions in this excellent post.

I had a few problems, though. Unfortunately, PXE error messages aren't very enlightening. At best you can find out which server application the error message refers to. For example, I got this one:

PXE-E53: No boot filename received

I don't know why it happened. My guess is that the laptop simply asked the wrong DHCP server (although I saw that tftpd32 was responding correctly). As soon as I removed my DSL router, I got another error:

PXE-E51: No DHCP or proxyDHCP offers were received

I guess this is another DHCP problem :-). The laptop claims that it didn't receive an offer from tftpd32, and it's a damn liar: tftpd32 told me otherwise. I struggled for half an hour before I finally got it working. I read something on the net about full-duplex problems and so on... well, nonsense. The solution was to go back to basics. I put both machines on a switch, configured the Windows network adapter manually with just 192.168.0.1 / 255.255.255.0, set "IP pool starting address" in tftpd32 to 192.168.0.2, made 192.168.0.1 the default router, and it worked. Well, DHCP worked, but not TFTP:

PXE-E32: TFTP open timeout

tftpd32 told me this instead:

Rcvd DHCP Discover Msg for IP 0.0.0.0, Mac 00:00:39:59:11:B4 [24/11 22:14:37.161]
DHCP: proposed address 192.168.0.1 [24/11 22:14:37.161]
2868 Request 2 not processed [24/11 22:14:37.239]
Rcvd DHCP Rqst Msg for IP 0.0.0.0, Mac 00:00:39:59:11:B4 [24/11 22:14:38.145]
Previously allocated address 192.168.0.1 acked [24/11 22:14:38.145]
Connection received from 192.168.0.1 on port 2070 [24/11 22:14:38.161]
Read request for file <pxelinux.0>. Mode octet [24/11 22:14:38.161]
OACK: <blksize=1456,> [24/11 22:14:38.161]
Using local port 55153 [24/11 22:14:38.161]
2868 Request 2 not processed [24/11 22:14:38.208]
File <pxelinux.0> : error 10054 in system call recv An existing connection was forcibly closed by the remote host. [24/11 22:14:38.270]

Okay. So the laptop requests pxelinux.0, tftpd32 acknowledges a block size of 1456 bytes, and then the connection gets closed from the laptop's side. Something went wrong on the remote machine, but the laptop doesn't remember and simply says "timeout". So I tried lots of things: I shortened the directory names (C:\pxe\...), tried other pxelinux.0 files, and so on.
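To make the "Read request" and "OACK" lines in the log a bit less cryptic, here is a minimal Python sketch of what those two packets look like on the wire, based on the TFTP packet layout from RFC 1350 and the option extension from RFC 2347/2348. The function names build_rrq and parse_oack are my own, purely for illustration. This is not how tftpd32 or the PXE ROM is implemented, just a reconstruction of the packet format.

```python
import struct
from typing import Dict, Optional

OP_RRQ = 1   # read request (RFC 1350)
OP_OACK = 6  # option acknowledgment (RFC 2347)


def build_rrq(filename: str, mode: str = "octet",
              options: Optional[Dict[str, object]] = None) -> bytes:
    """Build a TFTP read request, optionally with RFC 2347 options."""
    packet = struct.pack("!H", OP_RRQ)           # 2-byte opcode, big-endian
    packet += filename.encode("ascii") + b"\x00"  # NUL-terminated file name
    packet += mode.encode("ascii") + b"\x00"      # NUL-terminated transfer mode
    for name, value in (options or {}).items():
        # Each option is a NUL-terminated name followed by a NUL-terminated value.
        packet += name.encode("ascii") + b"\x00"
        packet += str(value).encode("ascii") + b"\x00"
    return packet


def parse_oack(packet: bytes) -> Dict[str, str]:
    """Return the options a server acknowledged in an OACK packet."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != OP_OACK:
        raise ValueError("not an OACK packet (opcode %d)" % opcode)
    fields = packet[2:].split(b"\x00")
    if fields and fields[-1] == b"":
        fields = fields[:-1]  # drop the empty piece after the final NUL
    return {fields[i].decode("ascii").lower(): fields[i + 1].decode("ascii")
            for i in range(0, len(fields) - 1, 2)}


# What the laptop's request for pxelinux.0 would roughly look like:
request = build_rrq("pxelinux.0", options={"blksize": 1456})
# And what the server's "OACK: <blksize=1456,>" corresponds to:
acknowledged = parse_oack(b"\x00\x06blksize\x001456\x00")
```

Without the blksize option, TFTP falls back to its default block size of 512 bytes.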

Nothing changed until I deleted everything except tftpd32.exe/.ini and pxelinux.0. Suddenly the laptop complained about several missing files (e.g. pxelinux.cfg/default). So I added the remaining files back step by step, and it worked: it loaded all the files necessary for booting the Debian installer. I switched the cables back, reconfigured the network connections (also in the installer), and could complete the installation. A happy ending with PXE.
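A side note on those missing-file complaints: PXELINUX looks for its configuration under pxelinux.cfg/ in a fixed order, going from a file named after the client's MAC address, through the client's IP address in uppercase hex (shortened one digit at a time), down to default. The Python sketch below lists those candidates. The function name is made up, the client address 192.168.0.2 is an assumption taken from the pool start configured above, and the lookup by client UUID that PXELINUX may try first is left out.

```python
import ipaddress
from typing import List


def pxelinux_config_candidates(mac: str, ip: str) -> List[str]:
    """List the config file names PXELINUX tries, in order (UUID step omitted)."""
    candidates = []
    # MAC-based name: ARP hardware type 01 (Ethernet), lowercase hex, dashes.
    candidates.append("01-" + mac.lower().replace(":", "-"))
    # IP-based names: the IPv4 address as 8 uppercase hex digits,
    # then the same string with one more trailing digit removed each time.
    hex_ip = "%08X" % int(ipaddress.IPv4Address(ip))
    for length in range(8, 0, -1):
        candidates.append(hex_ip[:length])
    # Last resort: the file the laptop complained about.
    candidates.append("default")
    return ["pxelinux.cfg/" + name for name in candidates]


# MAC taken from the tftpd32 log; the IP address is an assumed example.
for path in pxelinux_config_candidates("00:00:39:59:11:B4", "192.168.0.2"):
    print(path)
```

So a single pxelinux.cfg/default is enough for a one-off installation like this one; the more specific names only matter when different machines should get different boot menus.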