Hello,

By default, lava-coordinator is expected to run on localhost, so the second device connects to localhost instead of the machine that actually hosts the coordinator. Change the host in /etc/lava-coordinator/lava-coordinator.conf.
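
To check which host a given configuration points at, a minimal Python sketch could look like this. It assumes the file is JSON with a "coordinator_hostname" key, as in the stock configuration; check your own file if it differs. The function name and the two sample strings are only illustrative.

```python
import json

# Names that resolve to the local machine; inside a container or on a
# remote worker these do NOT point at the coordinator host.
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def coordinator_host_is_loopback(conf_text):
    """Return True if the configured coordinator host is a loopback name.

    Assumes the config is JSON with a "coordinator_hostname" key and
    treats a missing key as "localhost" (an assumption of this sketch).
    """
    conf = json.loads(conf_text)
    host = str(conf.get("coordinator_hostname", "localhost"))
    return host.strip().lower() in LOOPBACK_NAMES


# Illustrative contents only; 10.191.253.109 is the coordinator address
# that appears in the job log of this thread.
default_conf = '{"port": 3079, "blocksize": 4096, "poll_delay": 3, "coordinator_hostname": "localhost"}'
fixed_conf = '{"port": 3079, "blocksize": 4096, "poll_delay": 3, "coordinator_hostname": "10.191.253.109"}'

print(coordinator_host_is_loopback(default_conf))  # True
print(coordinator_host_is_loopback(fixed_conf))    # False
```

To check a real file, pass the contents of /etc/lava-coordinator/lava-coordinator.conf to the function, on every machine (or container) that runs lava-send or lava-wait.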

On Tue, Jun 29, 2021 at 07:19, Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
The "ssh" device will ssh to another machine, which is not on the same dispatcher, to start an iperf server.
The "dragonboard-410c" device will start a docker container, and inside that container it will call the iperf client to connect to the iperf server.

1. In the LAVA admin page, I linked both the "ssh" device and the "dragonboard-410c" device to the same worker.
2. However, the command run over ssh (iperf server) executes on another machine, while the command in the docker container (iperf client) runs on the worker machine itself, I think.
I'm not sure whether you mean 1 or 2?

On Mon, Jun 28, 2021 at 10:34 PM Remi Duraffort <remi.duraffort@linaro.org> wrote:
Hello,

On Mon, Jun 28, 2021 at 08:26, Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
Hello,

What additional information do I need to provide to debug this issue?

Thanks,
Hedy Lamarr


On Thu, Jun 17, 2021 at 4:34 PM Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
Yes. To make it clear, I restarted the LAVA server just now; here is the full log from when that multinode job ran:

2021-06-17 09:16:07,428    INFO [INIT] LAVA coordinator has started.
2021-06-17 09:16:07,757    INFO [INIT] Version 2021.03
2021-06-17 09:16:07,757    INFO [INIT] Loading configuration from /etc/lava-coordinator/lava-coordinator.conf
2021-06-17 09:16:08,076    INFO [BTSP] binding to 0.0.0.0:3079
2021-06-17 09:16:08,076    INFO Ready to accept new connections
2021-06-17 09:17:23,603    INFO The decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group will contain 2 nodes.
2021-06-17 09:17:23,603    INFO Waiting for 1 more clients to connect to decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group
2021-06-17 09:17:23,603    INFO Ready to accept new connections
2021-06-17 09:17:23,790    INFO Group complete, starting tests
2021-06-17 09:17:23,790    INFO Ready to accept new connections
2021-06-17 09:17:26,613    INFO Group complete, starting tests
2021-06-17 09:17:26,613    INFO Ready to accept new connections
2021-06-17 09:18:03,522   DEBUG clear Group Data: 1 of 2
2021-06-17 09:18:03,522    INFO Ready to accept new connections
2021-06-17 09:18:06,001   DEBUG clear Group Data: 2 of 2
2021-06-17 09:18:06,001   DEBUG Clearing group data for decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7
2021-06-17 09:18:06,001    INFO Ready to accept new connections
2021-06-17 09:24:43,620    INFO The 8956d8e7-1097-43e0-95dd-7afc61b2908b group will contain 2 nodes.
2021-06-17 09:24:43,620    INFO Waiting for 1 more clients to connect to 8956d8e7-1097-43e0-95dd-7afc61b2908b group
2021-06-17 09:24:43,620    INFO Ready to accept new connections
2021-06-17 09:24:43,871    INFO Group complete, starting tests
2021-06-17 09:24:43,871    INFO Ready to accept new connections
2021-06-17 09:24:46,634    INFO Group complete, starting tests
2021-06-17 09:24:46,634    INFO Ready to accept new connections
2021-06-17 09:25:45,746    INFO lava_send: {'port': 3079, 'blocksize': 4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1', 'client_name': '3077', 'group_name': '8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
2021-06-17 09:25:45,747    INFO lavaSend handler in Coordinator received a messageID 'server_ready' for group '8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077
2021-06-17 09:25:45,747   DEBUG message ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {"3077": {}} for 3076
2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747    INFO Ready to accept new connections

This log is similar to the log I saw on the web UI. The "SSH device" with lava_send looks OK, but the "dragonboard device for android test" with lava_wait does not: it just hangs. From the log above, it looks like the coordinator did not receive anything from it?

From what I see in the logs, lava-coordinator is not receiving any signal from the second test.
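
One way to confirm this from the coordinator log is to list which clients actually sent a request. The sketch below is only a rough helper: it parses lines shaped like the lava_send line in the log above, and it assumes other requests such as lava_wait would be logged in the same "INFO <request>: {...}" shape, which the log above does not confirm. The function name is made up for illustration.

```python
import ast
import re

# Matches e.g. "... INFO lava_send: {...}" and captures the request name
# and the Python-dict payload printed by the coordinator.
REQUEST_LINE = re.compile(r"INFO (lava_\w+): (\{.*\})\s*$")


def requests_seen(log_lines):
    """Return (client_name, role, request) for each request line found."""
    seen = []
    for line in log_lines:
        match = REQUEST_LINE.search(line)
        if match is None:
            continue  # not a request line (handler, broadcast, etc.)
        payload = ast.literal_eval(match.group(2))
        seen.append(
            (payload.get("client_name"), payload.get("role"), payload.get("request"))
        )
    return seen


# Three lines copied from the coordinator log quoted in this thread.
sample_log = [
    "2021-06-17 09:25:45,746    INFO lava_send: {'port': 3079, 'blocksize': 4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1', 'client_name': '3077', 'group_name': '8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}",
    "2021-06-17 09:25:45,747    INFO lavaSend handler in Coordinator received a messageID 'server_ready' for group '8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077",
    "2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {\"3077\": {}} for 3076",
]

print(requests_seen(sample_log))  # [('3077', 'host', 'lava_send')]
```

Only client 3077 (role "host") appears as a sender here; nothing from 3076 shows up as a request in these lines.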

Are both devices on the same dispatcher/worker?
 
 
On Thu, Jun 17, 2021 at 4:02 PM Remi Duraffort <remi.duraffort@linaro.org> wrote:


On Thu, Jun 17, 2021 at 09:11, Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
The output is:

service lava-coordinator status
● lava-coordinator.service - LAVA coordinator
   Loaded: loaded (/lib/systemd/system/lava-coordinator.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1 weeks 5 days ago
 Main PID: 629 (lava-coordinato)
    Tasks: 1 (limit: 4915)
   Memory: 7.4M
   CGroup: /system.slice/lava-coordinator.service
           └─629 /usr/bin/python3 /usr/bin/lava-coordinator --loglevel DEBUG

So it's working.

Is it listening on 10.191.253.109:3079?
Do you have anything in the lava-coordinator logs? (/var/log/lava-coordinator.log)
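
To test reachability from the machine (or container) where the test shell runs, a small TCP check is enough. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the address and port are the ones shown in the job log of this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, timeouts and DNS errors.
        return False


# Example call (run it where lava-wait is executed):
#     can_connect("10.191.253.109", 3079)
# and compare it with what the default configuration would try:
#     can_connect("localhost", 3079)
```

A successful connection only shows that something accepts TCP connections on that port; it does not prove that the listener is lava-coordinator.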



On Thu, Jun 17, 2021 at 3:05 PM Remi Duraffort <remi.duraffort@linaro.org> wrote:


On Thu, Jun 17, 2021 at 09:02, Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
Hello Remi,

I think lava-coordinator is running, because there are 2 devices here:
Device1: dragonboard-410c. When it runs lava-wait server_ready, it hangs with the log above.
Device2: ssh. When it runs lava-send server_ready, it shows: Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.

Is it possible that lava-coordinator works only for ssh, but not for dragonboard-410c?
Also, netstat shows:
tcp        0      0 0.0.0.0:3079            0.0.0.0:*               LISTEN      629/python3          off (0.00/0/0)
Does this mean the coordinator is running? If not, how can I make sure the coordinator is running?

service lava-coordinator status
 

Thanks,
Hedy Lamarr

On Thu, Jun 17, 2021 at 2:33 PM Remi Duraffort <remi.duraffort@linaro.org> wrote:
Hello,

Do you have lava-coordinator running?

On Mon, Jun 14, 2021 at 14:29, Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
By the way, we use 2021.03.post1.

On Wed, Jun 9, 2021 at 10:40 AM Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
Dear community,

We are new to LAVA and are trying to use it for our Android tests. We have an issue when testing iperf.

Job:

job_name: android iperf test
timeouts:
  job:
    minutes: 10080
  action:
    minutes: 120
  connection:
    minutes: 5
priority: medium
visibility: public
protocols:
  lava-multinode:
    roles:
      device:
        count: 1
        device_type: dragonboard-410c
        timeout:
          minutes: 5
      host:
        count: 1
        device_type: ssh
        timeout:
          minutes: 5
        context:
          ssh_host: localhost
          ssh_user: root
          ssh_port: 22
          ssh_identity_file: /root/.ssh/id_rsa
actions:
- deploy:
    role:
    - host
    timeout:
      minutes: 2
    to: ssh
    os: debian
- boot:
    role:
    - host
    method: ssh
    connection: ssh
    prompts:
      - '@labpc1'
- test:
    role:
    - host
    timeout:
      minutes: 120
    definitions:
    - from: inline
      name: smoke-case
      path: inline/test.yaml
      repository:
        metadata:
          format: Lava-Test Test Definition
          name: smoke
          description: Run smoke case
        run:
          steps:
          - sleep 60
          - lava-send "server_ready"
          - iperf -s -V -P 1
- test:
    role:
    - device
    definitions:
    - from: inline
      name: cts_cts-media_test
      path: inline/cts_cts-media_test.yaml
      repository:
        metadata:
          description: cts cts-media test run
          format: Lava-Test Test Definition 1.0
          name: cts-cts-media-test-run
        run:
          steps:
          - adb wait-for-device
          - adb devices
          - adb root
          - adb wait-for-device
          - adb devices
          - lava-wait "server_ready"
          - sleep 3
          - lava-test-case "Case1" --shell adb shell /data/local/iperf -c 10.191.253.21 -t 10
    docker:
      image: terceiro/android-platform-tools
    timeout:
      minutes: 4200

The job log for dragonboard-410c is:
+ lava-wait server_ready
<LAVA_WAIT_DEBUG  preparing Wed Jun  8 10:07:22 CST 2021>
<LAVA_WAIT_DEBUG  started Wed Jun  8 10:07:22 CST 2021>
<LAVA_MULTI_NODE> <LAVA_WAIT server_ready>
<LAVA_WAIT_DEBUG  finished Wed Jun  8 10:07:22 CST 2021>
<LAVA_WAIT_DEBUG  finished Wed Jun  8 10:07:22 CST 2021>
<LAVA_WAIT_DEBUG  starting to wait Wed Jun  8 10:07:22 CST 2021>
NOTE: the job appears to hang at this step and cannot continue.

The job log for ssh is:
+ lava-send server_ready
<LAVA_SEND_DEBUG lava_multi_node_send preparing Wed Jun  8 10:07:53 CST 2021>
<LAVA_SEND_DEBUG _lava_multi_node_send started Wed Jun  8 10:07:53 CST 2021>
<LAVA_MULTI_NODE> <LAVA_SEND server_ready>
Received Multi_Node API <LAVA_SEND>
messageID: SEND-server_ready
lava-multinode lava-send
Handling signal <LAVA_SEND {"request": "lava_send", "messageID": "server_ready", "message": {}, "timeout": 300}>
Setting poll timeout of 300 seconds
requesting lava_send server_ready
message: {}
requesting lava_send server_ready with args {}
request_send server_ready {}
Sending {'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
final message: {"port": 3079, "blocksize": 4096, "poll_delay": 3, "host": "10.191.253.109", "hostname": "lavaslave1", "client_name": "3035", "group_name": "8a362e2a-6ee9-4f48-bddb-378ac2425f06", "role": "host", "request": "lava_send", "messageID": "server_ready", "message": {}}
Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.
case: multinode-send-server_ready
case_id: 39177
definition: 0_smoke-case
result: pass
<LAVA_SEND_DEBUG _lava_multi_node_send finished Wed Jun  8 10:07:53 CST 2021>
<LAVA_SEND_DEBUG lava_multi_node_send finished Wed Jun  8 10:07:53 CST 2021>
+ iperf -s -V -P 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------

It seems that one node can't receive server_ready from the other node. What's wrong with my job? Please help!

Thanks,
Hedy Lamarr

_______________________________________________
Lava-users mailing list
Lava-users@lists.lavasoftware.org
https://lists.lavasoftware.org/mailman/listinfo/lava-users


--
Rémi Duraffort
TuxArchitect
Linaro

