On 28 Jan 2019, at 16:37, Neil Williams neil.williams@linaro.org wrote:
On Mon, 28 Jan 2019 at 16:11, Diego Russo Diego.Russo@arm.com wrote:
On 28 Jan 2019, at 11:20, Neil Williams neil.williams@linaro.org wrote:
On Mon, 28 Jan 2019 at 11:02, Diego Russo Diego.Russo@arm.com wrote:
Hello,
I have the following setup: a WaRP7 which exposes a network connection over USB gadget driver (http://trac.gateworks.com/wiki/linux/OTG#g_etherGadget)
As long as the device is capable of raising the network interface from a POSIX test action on the device, there is no need to even care that this is a USB anything. It's TCP/IP and that's all that any other test action needs to know.
Exactly, I’m passing the usb0 interface to the container by having the following in /etc/lxc/default.conf:
lxc.network.1.type = phys
lxc.network.1.link = usb0
lxc.network.1.name = usb0
lxc.network.1.flags = up
That contaminates EVERY LXC test job with usb0 which is never going to be acceptable. Do NOT do this, under any circumstances.
This needs to be test-job specific, i.e. defined in the device dictionary and managed via a udev rule written by LAVA, which then also covers re-adding to the LXC automatically. However, there is actually no reason to even care about USB, so the whole issue goes away.
I know this is going to affect every LXC container on the slave, but in this specific case I had a one-to-one WaRP7-slave relationship.
This means the usb0 network interface will be passed to the container as usb0. This works as long as the usb0 interface exists on the host.
Please think carefully and describe EXACTLY what you are aiming to do because it sounds like there is confusion here about how to interact with the device.
My aim for this specific test is:
- "Install" an application on the slave which interacts with the WaRP7
That would be better done as a custom docker image or a custom VM which interacts with the device solely over TCP/IP. Installing things takes time - better to start a pre-installed container or VM.
The device, when booted, broadcasts a MAC address and gets an IP address from DHCP. That DHCP can be configured to always give the same IP address for the same MAC address. This IP address is then defined in the device dictionary.
The device is then one node, with no LXC and no USB handling or udev rules or system-wide LXC changes.
A second node is defined which addresses the device using the IP address.
The two nodes are defined in a MultiNode test job.
There is no DHCP service involved. On the WaRP7 side the usb0 interface sets link-local IPv4 and IPv6 addresses. The MAC address of this interface is randomly generated at every boot, hence the link-local addresses change too.
root@mbed-linux-os-1618:~# ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

usb0      Link encap:Ethernet  HWaddr C6:A3:F6:38:2C:3F
          inet6 addr: fe80::c4a3:f6ff:fe38:2c3f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:480 (480.0 B)  TX bytes:342 (342.0 B)

usb0:avahi Link encap:Ethernet  HWaddr C6:A3:F6:38:2C:3F
          inet addr:169.254.6.35  Bcast:169.254.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
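To illustrate why a random MAC address at every boot also means a new IPv6 link-local address, here is a minimal Python sketch of the EUI-64 derivation. It assumes the interface uses EUI-64 address generation, which is consistent with the usb0 output above but is not the only mode a kernel can be configured for. The function name link_local_from_mac is hypothetical and is not part of any tool mentioned in this thread.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive the EUI-64 based fe80::/64 address for a 48-bit MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    # Flip the universal/local bit (0x02) of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe in the middle to expand 48 bits into a 64-bit interface ID.
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]
    # The fe80::/64 prefix is followed by the 64-bit interface ID.
    packed = bytes([0xFE, 0x80] + [0] * 6 + interface_id)
    return ipaddress.IPv6Address(packed)


# MAC address taken from the usb0 entry in the ifconfig output above.
print(link_local_from_mac("C6:A3:F6:38:2C:3F"))  # fe80::c4a3:f6ff:fe38:2c3f
```

The result matches the inet6 address shown for usb0, so any client that cached the address from a previous boot would need to rediscover it.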
- this application is a CLI application which interacts with the WaRP7 using the usb0 interface
It sounds like you are confusing USB with usb0 and TCP/IP. The CLI application interacts with the device using TCP/IP. Thinking of that as usb leads you into the problem of adding a USB device which probably isn't even necessary.
No, I’m not confusing it, I just didn’t explain it very well. When I say usb0 I mean the actual network interface (see above). I have the same network interface on the host side as well.
root@mbl-lava-dispatcher-3:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:07:4c:66 brd ff:ff:ff:ff:ff:ff
61: lxcbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
122: usb0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 4e:2e:50:11:be:7e brd ff:ff:ff:ff:ff:ff
Note the MAC address of this interface changes EVERY time.
Another note: on the host side, for testing purposes I’ve disabled predictable network interface names. This way my interface name is always usb0.
https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfac...
By default it is enabled and the interface name might change at every reboot of the board (e.g. enp0s12u4).
Or is your CLI not using networking at all but hacking into the kernel networking stack directly? Is the CLI trying to open the usb0 interface as a USB device? Why?
My CLI application uses mDNS/avahi over the network, so it doesn’t do anything special with the kernel. It’s a standard application-level binary which in the end uses sockets. Again, when I say usb0 interface I mean the network interface named usb0. We don’t even specify which interface to use because, as far as the application is concerned, usb0 is just another network interface. It works with both IPv4 and IPv6.
- Flash and boot the Warp7
- Tests are run ON the lava slave using the application installed earlier
Tests can just as easily be run in a docker image or a VM - it doesn't need to be on the lava slave at all, as long as it can see the TCP/IP address. This way, you can debug your test definitions by running the same image against a device on your desk, outside of LAVA.
For this reason I wanted to use LXC support in LAVA and just wanted to use usb0 network interface within that container.
- the application uses the usb0 interface on the slave
The application uses the IP address raised on whatever interface the device is configured to use, in this case it happens to be called usb0 but the application has no idea how it is implemented, it's standard TCP/IP.
Exactly, the application uses usb0 interface as standard TCP/IP interface.
- There are no tests running on the WaRP7, but the board might be rebooted while the above tests are running
So the application needs to be able to buffer until the device comes back on the same IP address - that's manageable via DHCP and the MAC address. Once the other node has the IP address of the device, it doesn't matter what the device does - providing the device always re-establishes the TCP/IP connection.
As I said earlier, there is no DHCP server involved. Everything works with link-local IP addresses (both IPv4 and IPv6) and those change at every reboot of the board (which is fine: we can rescan for it via mDNS/avahi).
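Whichever way the address is obtained (a fixed DHCP lease or an mDNS rescan), the off-target side has to tolerate the board being unreachable during a reboot. A minimal Python sketch of that retry loop follows. It is only an illustration: wait_for_tcp is a hypothetical helper, the port is whatever service the board exposes, and the rediscovery step itself is not shown.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=120.0, interval=2.0):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection resolves the host and tries each address.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: the board is probably
            # still rebooting, so retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

On Linux, an IPv6 link-local target generally needs the interface given as a zone suffix, for example an address ending in %usb0, so that the kernel knows which link to use.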
The USB gadget interface is the wrong side of the interface - you already have a POSIX test action running on the device, so use that to raise and configure the TCP/IP by accessing the relevant driver support directly on the device.
On the device I don’t need to run anything. The “issue” is on the host (lava slave) side.
I disagree. The issue is the confusion of /dev/bus/usb and usb0 - along with the mistake of putting testjob specific configuration into a system-wide file.
As stated earlier, the device comes up with the right settings already: usb0 up and running with IPv4 and IPv6 link-local addresses.
A possible test case is to have some process running on the LAVA dispatcher (within a LXC container) which targets the WaRP7 over this network interface.
LXC support does not provide any means of synchronisation across test actions. Strict sequence only. If anything isn't ready, the test definition will either have to just cope with the situation or fail the entire test job.
Why are you trying to do USB device passthrough when you have this network interface? This device doesn't need an LXC to run a standard test job. https://staging.validation.linaro.org/scheduler/job/248129
Therefore, avoid using the LXC protocol in the first place and communicate over the network. You'll need to declare the IP address of the device but that's a standard MultiNode API call from a POSIX shell on the device.
The process running the test case does NOT have to be on the LAVA dispatcher if it is targeting the device over TCP/IP. All it needs is the IP address, nothing USB at all.
In our case we don’t need to run anything on the WaRP7. The WaRP7 just needs to be up and running and be visible via usb0 from the dispatcher.
The device just needs to be configured to automatically raise a network interface and get an IP address when booted. What interface that uses is completely irrelevant. This makes it trivial to test with a different kind of device, or two docker images, or two QEMU VMs, etc.
Tests are run on lava dispatcher
Tests would be better run in a dedicated container. Quicker and easier to reproduce.
Yes, what I meant is that tests are running off target. For this reason I looked into LXC containers, as LAVA already supports them.
WaRP7 <—> usb0 net iface <———> usb0 net iface <—> LAVA slave
device <--> TCP/IP <--> container.
The LAVA slave does not need to have any part in this (apart from running two test jobs).
Unfortunately it has: WaRP7 doesn’t have any wired network interface and it exposes the usb0 interface over the USB power cable which is connected to the LAVA slave. The same cable is used to flash the board (via uboot ums)
This WaRP7 has a physical connection via USB with the LAVA slave (from the OS point of view it is just another network interface) and I don’t think it can be reached by other nodes.
The USB node is of no concern. The device can be booted without needing any LXC and it can be configured to raise usb0 at boot. It can be configured to get a DHCP IP address at boot.
The only thing anything outside the device needs to know about it is the IP address, and that is configured by allocating an address in the DHCP config of the lab.
There is no DHCP service involved: as soon as the board is up and running it can be discovered via IPv6 with the link-local address.
Through LXC I'm able to pass this interface through from the host to the container and use it within the container (via /etc/lxc/default.conf)
How are you passing it through? If the device is dynamic, you must declare the board_id of the device in the device dictionary so that LAVA will create a suitable udev rule to add the re-enumerated device back to the LXC when udev sees an ADD event.
I’m passing the usb0 network interface to LXC as stated at the beginning of the email. usb0 is just.
I also tried to use the board_id but unfortunately the device doesn’t have any iSerial (only USB and vendor IDs). Or rather, the iSerial field is 0.
All the more reason to disregard the entire /dev/bus/usb issue and use TCP/IP as standalone.
Agreed.
If a test requires the reboot of the WaRP7, the usb0 interface disappears from the container. When the WaRP7 boots again the usb0 interface is available on the host (but not in the container).
The usb0 interface is accessible from the device and you're already running a POSIX shell in the test action on the device, so that test action needs to take care of re-establishing the network connection (and possibly re-declaring the IP address to the other node).
Again, the problem is on the container side. As soon as the board is up and running I have usb0 network interface both on WaRP7 and host. The container though loses visibility.
Things I tried or thought about:
- I tried synchronizing the boots of both the WaRP7 and the LXC container, but it does not seem possible to "reboot" (restart) a container within the same job execution.
- Is it possible to "restart" a container during a job execution?
No. This has nothing to do with the start of the LXC.
Well it does because if I restart the LXC container AFTER the board has rebooted, usb0 is re-passed through and it has visibility of this network interface.
You cannot contaminate every LXC ever run on that lava-slave with the usb0 device details - do not make changes to /etc/lxc/default.conf - that cannot scale.
Provided I don’t do that, how can I pass the enp0s12u4 to the LXC container?
- Outside LAVA it is possible to run a command (lxc-device --name diegor-test -- add usb0) which passes the interface through again from Linux to the LXC container.
- Is it possible to run the above command at job execution time on the lava dispatcher?
How can I solve this situation?
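For reference, the manual re-add outside LAVA could be scripted roughly as below. This is only a sketch of the host-side workaround described in the list above, not something LAVA is known to run for you: the function names are hypothetical, the container name is whatever was used when the container was created, and the command is the same lxc-device invocation quoted above.

```python
import os
import subprocess
import time


def interface_present(iface, sysfs_root="/sys/class/net"):
    """Check whether the host currently has a network interface `iface`."""
    return os.path.exists(os.path.join(sysfs_root, iface))


def lxc_add_command(container, iface):
    """Build the lxc-device invocation in the form quoted in this thread."""
    return ["lxc-device", "--name", container, "--", "add", iface]


def readd_when_present(container, iface, timeout=120.0, interval=1.0,
                       run=subprocess.run, sysfs_root="/sys/class/net"):
    """Poll for `iface` on the host and hand it to the container once it appears.

    Returns True if the command was run, False if the interface never
    showed up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if interface_present(iface, sysfs_root):
            # check=True raises CalledProcessError if lxc-device fails.
            run(lxc_add_command(container, iface), check=True)
            return True
        time.sleep(interval)
    return False
```

Calling readd_when_present("diegor-test", "usb0") after triggering the board reboot would wait for usb0 to reappear on the host and then pass it to the container. It needs to run on the host with sufficient privileges, which is exactly the part that is unclear inside a LAVA job.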
If you do want to do passthrough:
https://master.lavasoftware.org/static/docs/v2/admin-lxc-deploy.html#deployi...
https://lava.codehelp.co.uk/scheduler/job/4313#action_1-2
https://lava.codehelp.co.uk/scheduler/device/tom/devicedict#defline11
As said earlier, the board_id is 0.
Then passthrough is undermined by the broken hardware / firmware.
I don’t know why, but the WaRP7 has been designed this way: it doesn’t expose an ID over the serial connection either (even though it has an FTDI chip on it).
If you want to use MultiNode, use a QEMU device as the second node which communicates with the other node using the MultiNode API.
https://master.lavasoftware.org/static/docs/v2/multinode.html
I think using the multinode won’t help for this specific case.
Cheers
-- Diego Russo Staff Software Engineer - diego.russo@arm.com Direct Tel. no: +44 1223 405920 Main Tel. no: +44 1223 400400 ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom http://www.diegor.co.uk - http://twitter.com/diegor http://www.linkedin.com/in/diegor
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. _______________________________________________ Lava-users mailing list Lava-users@lists.lavasoftware.org https://lists.lavasoftware.org/mailman/listinfo/lava-users
--
Neil Williams
neil.williams@linaro.org http://www.linux.codehelp.co.uk/
-- Diego Russo | Staff Software Engineer | Mbed Linux OS ARM Ltd. CPC1, Capital Park, Cambridge Road, Fulbourn, CB21 5XE, United Kingdom http://www.diegor.co.uk - https://os.mbed.com/linux-os/