Troubleshooting tips

Issues with main WALT service

Most WALT features are handled on the server side by several systemd services. The main service is called walt-server.

When this service has a problem, the walt command line tool most often prints the following error message:

$ walt node show
Network connection to WalT server failed!
$

The main service must listen on TCP ports 12345 and 12347 to handle walt client requests, so this message most often means that the main service is down.
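Before logging in to the server, the two ports can be probed from any machine. The following is a minimal sketch, not part of the walt tooling: the function names port_open and check_walt_ports are hypothetical, and it assumes that bash (for its /dev/tcp redirection) and the coreutils timeout command are available.

```shell
# Hypothetical helpers (not provided by walt): probe the two TCP ports
# that the main service is expected to listen on.

# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
port_open() {
    local host="$1" port="$2"
    # /dev/tcp/<host>/<port> is a bash feature, hence the explicit bash -c.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print one status line per port; the host defaults to localhost.
check_walt_ports() {
    local host="${1:-localhost}" port
    for port in 12345 12347; do
        if port_open "$host" "$port"; then
            echo "port $port: open"
        else
            echo "port $port: closed"
        fi
    done
}
```

For example, check_walt_ports walt-server probes the server under the hostname used in the prompts of this page (an assumption about your setup). If both ports are reported as closed, the main service is probably down.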

You can verify this by running:

root@walt-server:~$ systemctl status walt-server

And you can check the systemd journal for this service by running:

root@walt-server:~$ journalctl -au walt-server

Usually the issue is minor, and you can restart the service by using:

root@walt-server:~$ systemctl restart walt-server

If the restart fails, the cause is most often a problem with another OS service on the server. See the next section.
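The status, restart, and journal steps above can be chained in a small helper. This is only a sketch: the function name ensure_walt_server is hypothetical, it must run as root on the server, and the choice of 30 journal lines is arbitrary.

```shell
# Hypothetical helper (not provided by walt): restart walt-server only
# when it is not active, and show recent journal lines if that fails.
ensure_walt_server() {
    local unit="walt-server"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
        return 0
    fi
    echo "$unit is not active, restarting"
    if systemctl restart "$unit" && systemctl is-active --quiet "$unit"; then
        echo "$unit restarted successfully"
        return 0
    fi
    echo "$unit failed to restart, last journal lines:"
    # 30 lines is an arbitrary amount of context.
    journalctl -a -u "$unit" -n 30 --no-pager
    return 1
}
```

A non-zero return status means the restart did not bring the service back, which is the case covered by the next section.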

Issues with other OS services

To list the OS services that failed to start, run:

root@walt-server:~$ systemctl list-units --failed

Then you can check the systemd journal for a given failing service by typing:

root@walt-server:~$ journalctl -au <failing-service>
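When several services have failed, the two commands above can be combined in a loop. The sketch below uses hypothetical function names (failed_units, show_failed_journals) and an arbitrary limit of 20 journal lines per unit; it assumes it is run as root on the server.

```shell
# Hypothetical helpers (not provided by walt).

# Print the name of each failed unit, one per line.
failed_units() {
    # --plain and --no-legend make the output easy to parse:
    # the unit name is the first column of each line.
    systemctl list-units --failed --plain --no-legend | awk '{print $1}'
}

# Print the last journal lines of every failed unit.
show_failed_journals() {
    local unit
    for unit in $(failed_units); do
        echo "== $unit =="
        journalctl -a -u "$unit" -n 20 --no-pager
    done
}
```

Redirecting the output of show_failed_journals to a file gives a single document that can be attached to an issue report.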

Issues with nodes

Issues with virtual nodes are easier to troubleshoot than issues with physical nodes, because you can use walt node console to get early boot messages. See walt help show node-console.

For this reason, when developing a new walt image or making low-level OS changes, it is recommended to test on a virtual node first, if possible. Virtual nodes have a pc-x86-64 architecture, so the image must be compatible with this architecture.

To debug early boot problems on other architectures, you will need a serial connection. For Raspberry Pi boards, see walt help show rpi-serial.

Reporting and asking for help

You can report new issues at https://github.com/drakkar-lig/walt-python-packages/issues. If you have subscribed to the walt-users mailing list, you can also send a message there. You can also ask for help by sending an email to the walt dev team: walt-contact at univ-grenoble-alpes.fr.

Try to include any relevant diagnostic data. If the main service is working properly, or if you could restart it successfully, you can dump its log data to a file using:

$ walt log show --server --history -1h: > server.log

When the main service is working properly, this is usually more reliable than using journalctl. This example dumps all log data from the previous hour. Adjust the -1h: parameter to suit your case.
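As an illustration of adjusting the time window, the sketch below wraps the command above in a function that takes a number of hours. The function names history_range and dump_walt_server_log are hypothetical, and the sketch assumes that the -1h: syntax accepts other whole numbers of hours in the same way.

```shell
# Hypothetical helpers (not provided by walt).

# Build a history range such as "-3h:" from a whole number of hours.
history_range() {
    local hours="$1"
    case "$hours" in
        ''|*[!0-9]*)
            echo "hours must be a whole number" >&2
            return 1
            ;;
    esac
    echo "-${hours}h:"
}

# Dump the server logs of the last N hours (default 1) to a file
# (default server.log).
dump_walt_server_log() {
    local hours="${1:-1}" outfile="${2:-server.log}" range
    range=$(history_range "$hours") || return 1
    walt log show --server --history "$range" > "$outfile"
}
```

For example, dump_walt_server_log 6 incident.log would save the last six hours of server logs to incident.log.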