On 05. 10. 21 7:31, Manish DUBEY wrote:
> Hello Team,
>
> Hope you are doing well.
>
> We have some users in ST environment being subscribed to your Mailing list. After few days they use to get an email stating that "Your membership in the mailing list has been disabled due to excessive bounces"..
>
> We have verified at our end and didn't find any bounce back email getting generated. To investigate the case, could you please share couple of sample bounce back (NDR) email received by you from ST environment. The NDR may help us to investigate the cause of bounce back email getting generated.
>
> If possible, we can schedule a call as well to discuss the case. Feel free to share your availability to discuss the case and we can share a Teams invite accordingly.
> Awaiting your response.
It seems like a DMARC verification issue on your side:
Oct 5 07:29:12 alsa0 postfix/smtp[23459]: AFEC9839:
to=<arnaud.pouliquen(a)foss.st.com>,
relay=mxa-00178001.gslb.pphosted.com[185.132.182.106]:25, delay=51,
delays=50/0.19/0.48/0.33, dsn=5.7.0, status=bounced (host
mxa-00178001.gslb.pphosted.com[185.132.182.106] said: 550 5.7.0 Local Policy
Violation - DMARC Reject (in reply to end of DATA command))
Jaroslav
--
Jaroslav Kysela <perex(a)perex.cz>
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.
Currently the GPIO Aggregator does not support interrupts. This means
that kernel drivers going from a GPIO to an IRQ using gpiod_to_irq(),
and userspace applications using line events do not work.
Add interrupt support by providing a gpio_chip.to_irq() callback, which
just calls into the parent GPIO controller.
Note that this does not implement full interrupt controller (irq_chip)
support, so using e.g. gpio-keys with "interrupts" instead of "gpios"
still does not work.
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
---
I would prefer to avoid implementing irq_chip support, until there is a
real use case for this.
This has been tested with gpio-keys and gpiomon on the Koelsch
development board:
- gpio-keys, using a DT overlay[1]:
$ overlay add r8a7791-koelsch-keyboard-controlled-led
$ echo gpio-aggregator > /sys/devices/platform/frobnicator/driver_override
$ echo frobnicator > /sys/bus/platform/drivers/gpio-aggregator/bind
$ gpioinfo frobnicator
gpiochip12 - 3 lines:
line 0: "light" "light" output active-high [used]
line 1: "on" "On" input active-low [used]
line 2: "off" "Off" input active-low [used]
$ echo 255 > /sys/class/leds/light/brightness
$ echo 0 > /sys/class/leds/light/brightness
$ evtest /dev/input/event0
- gpiomon, using the GPIO sysfs API:
$ echo keyboard > /sys/bus/platform/drivers/gpio-keys/unbind
$ echo e6055800.gpio 2,6 > /sys/bus/platform/drivers/gpio-aggregator/new_device
$ gpiomon gpiochip12 0 1
[1] "ARM: dts: koelsch: Add overlay for keyboard-controlled LED"
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git/c…
---
drivers/gpio/gpio-aggregator.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-aggregator.c b/drivers/gpio/gpio-aggregator.c
index 34e35b64dcdc0581..2ff867d2a3630d3b 100644
--- a/drivers/gpio/gpio-aggregator.c
+++ b/drivers/gpio/gpio-aggregator.c
@@ -374,6 +374,13 @@ static int gpio_fwd_set_config(struct gpio_chip *chip, unsigned int offset,
return gpiod_set_config(fwd->descs[offset], config);
}
+static int gpio_fwd_to_irq(struct gpio_chip *chip, unsigned int offset)
+{
+ struct gpiochip_fwd *fwd = gpiochip_get_data(chip);
+
+ return gpiod_to_irq(fwd->descs[offset]);
+}
+
/**
* gpiochip_fwd_create() - Create a new GPIO forwarder
* @dev: Parent device pointer
@@ -414,7 +421,8 @@ static struct gpiochip_fwd *gpiochip_fwd_create(struct device *dev,
for (i = 0; i < ngpios; i++) {
struct gpio_chip *parent = gpiod_to_chip(descs[i]);
- dev_dbg(dev, "%u => gpio-%d\n", i, desc_to_gpio(descs[i]));
+ dev_dbg(dev, "%u => gpio %d irq %d\n", i,
+ desc_to_gpio(descs[i]), gpiod_to_irq(descs[i]));
if (gpiod_cansleep(descs[i]))
chip->can_sleep = true;
@@ -432,6 +440,7 @@ static struct gpiochip_fwd *gpiochip_fwd_create(struct device *dev,
chip->get_multiple = gpio_fwd_get_multiple_locked;
chip->set = gpio_fwd_set;
chip->set_multiple = gpio_fwd_set_multiple_locked;
+ chip->to_irq = gpio_fwd_to_irq;
chip->base = -1;
chip->ngpio = ngpios;
fwd->descs = descs;
--
2.25.1
Hi,
One of the goals of Project Stratos is to enable hypervisor agnostic
backends so we can enable as much re-use of code as possible and avoid
repeating ourselves. This is the flip side of the front end where
multiple front-end implementations are required - one per OS, assuming
you don't just want Linux guests. The resultant guests are trivially
movable between hypervisors modulo any abstracted paravirt type
interfaces.
In my original thumb nail sketch of a solution I envisioned vhost-user
daemons running in a broadly POSIX like environment. The interface to
the daemon is fairly simple requiring only some mapped memory and some
sort of signalling for events (on Linux this is eventfd). The idea was a
stub binary would be responsible for any hypervisor specific setup and
then launch a common binary to deal with the actual virtqueue requests
themselves.
Since that original sketch we've seen an expansion in the sort of ways
backends could be created. There is interest in encapsulating backends
in RTOSes or unikernels for solutions like SCMI. There interest in Rust
has prompted ideas of using the trait interface to abstract differences
away as well as the idea of bare-metal Rust backends.
We have a card (STR-12) called "Hypercall Standardisation" which
calls for a description of the APIs needed from the hypervisor side to
support VirtIO guests and their backends. However we are some way off
from that at the moment as I think we need to at least demonstrate one
portable backend before we start codifying requirements. To that end I
want to think about what we need for a backend to function.
Configuration
=============
In the type-2 setup this is typically fairly simple because the host
system can orchestrate the various modules that make up the complete
system. In the type-1 case (or even type-2 with delegated service VMs)
we need some sort of mechanism to inform the backend VM about key
details about the system:
- where virt queue memory is in it's address space
- how it's going to receive (interrupt) and trigger (kick) events
- what (if any) resources the backend needs to connect to
Obviously you can elide over configuration issues by having static
configurations and baking the assumptions into your guest images however
this isn't scalable in the long term. The obvious solution seems to be
extending a subset of Device Tree data to user space but perhaps there
are other approaches?
Before any virtio transactions can take place the appropriate memory
mappings need to be made between the FE guest and the BE guest.
Currently the whole of the FE guests address space needs to be visible
to whatever is serving the virtio requests. I can envision 3 approaches:
* BE guest boots with memory already mapped
This would entail the guest OS knowing where in it's Guest Physical
Address space is already taken up and avoiding clashing. I would assume
in this case you would want a standard interface to userspace to then
make that address space visible to the backend daemon.
* BE guests boots with a hypervisor handle to memory
The BE guest is then free to map the FE's memory to where it wants in
the BE's guest physical address space. To activate the mapping will
require some sort of hypercall to the hypervisor. I can see two options
at this point:
- expose the handle to userspace for daemon/helper to trigger the
mapping via existing hypercall interfaces. If using a helper you
would have a hypervisor specific one to avoid the daemon having to
care too much about the details or push that complexity into a
compile time option for the daemon which would result in different
binaries although a common source base.
- expose a new kernel ABI to abstract the hypercall differences away
in the guest kernel. In this case the userspace would essentially
ask for an abstract "map guest N memory to userspace ptr" and let
the kernel deal with the different hypercall interfaces. This of
course assumes the majority of BE guests would be Linux kernels and
leaves the bare-metal/unikernel approaches to their own devices.
Operation
=========
The core of the operation of VirtIO is fairly simple. Once the
vhost-user feature negotiation is done it's a case of receiving update
events and parsing the resultant virt queue for data. The vhost-user
specification handles a bunch of setup before that point, mostly to
detail where the virt queues are set up FD's for memory and event
communication. This is where the envisioned stub process would be
responsible for getting the daemon up and ready to run. This is
currently done inside a big VMM like QEMU but I suspect a modern
approach would be to use the rust-vmm vhost crate. It would then either
communicate with the kernel's abstracted ABI or be re-targeted as a
build option for the various hypervisors.
One question is how to best handle notification and kicks. The existing
vhost-user framework uses eventfd to signal the daemon (although QEMU
is quite capable of simulating them when you use TCG). Xen has it's own
IOREQ mechanism. However latency is an important factor and having
events go through the stub would add quite a lot.
Could we consider the kernel internally converting IOREQ messages from
the Xen hypervisor to eventfd events? Would this scale with other kernel
hypercall interfaces?
So any thoughts on what directions are worth experimenting with?
--
Alex Bennée
The I2C protocol allows zero-length requests with no data, like the
SMBus Quick command, where the command is inferred based on the
read/write flag itself.
In order to allow such a request, allocate another bit,
VIRTIO_I2C_FLAGS_M_RD(1), in the flags to pass the request type, as read
or write. This was earlier done using the read/write permission to the
buffer itself.
This still won't work well if multiple buffers are passed for the same
request, i.e. the write-read requests, as the VIRTIO_I2C_FLAGS_M_RD flag
can only be used with a single buffer.
Coming back to it, there is no need to send multiple buffers with a
single request. All we need, is a way to group several requests
together, which we can already do based on the
VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
Remove support for multiple buffers within a single request.
Since we are at very early stage of development currently, we can do
these modifications without addition of new features or versioning of
the protocol.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V2->V3:
- Add conformance clauses that require that the flag is consistent with the
buffer.
V1->V2:
- Name the buffer-less request as zero-length request.
Hi Guys,
I did try to follow the discussion you guys had during V4, where we added
support for multiple buffers for the same request, which I think is unnecessary
now, after introduction of the VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
https://lists.oasis-open.org/archives/virtio-comment/202011/msg00005.html
And so starting this discussion again, because we need to support stuff
like: i2cdetect -q <i2c-bus-number>, which issues a zero-length SMBus
Quick command.
---
virtio-i2c.tex | 66 +++++++++++++++++++++++++++-----------------------
1 file changed, 36 insertions(+), 30 deletions(-)
diff --git a/virtio-i2c.tex b/virtio-i2c.tex
index 949d75f44158..c7335372a8bb 100644
--- a/virtio-i2c.tex
+++ b/virtio-i2c.tex
@@ -54,8 +54,7 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
\begin{lstlisting}
struct virtio_i2c_req {
struct virtio_i2c_out_hdr out_hdr;
- u8 write_buf[];
- u8 read_buf[];
+ u8 buf[];
struct virtio_i2c_in_hdr in_hdr;
};
\end{lstlisting}
@@ -84,16 +83,16 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
and sets it on the other requests. If this bit is set and a device fails
to process the current request, it needs to fail the next request instead
of attempting to execute it.
+
+\item[VIRTIO_I2C_FLAGS_M_RD(1)] is used to mark the request as READ or WRITE.
\end{description}
Other bits of \field{flags} are currently reserved as zero for future feature
extensibility.
-The \field{write_buf} of the request contains one segment of an I2C transaction
-being written to the device.
-
-The \field{read_buf} of the request contains one segment of an I2C transaction
-being read from the device.
+The \field{buf} of the request is optional and contains one segment of an I2C
+transaction being read from or written to the device, based on the value of the
+\field{VIRTIO_I2C_FLAGS_M_RD} bit in the \field{flags} field.
The final \field{status} byte of the request is written by the device: either
VIRTIO_I2C_MSG_OK for success or VIRTIO_I2C_MSG_ERR for error.
@@ -103,27 +102,27 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
#define VIRTIO_I2C_MSG_ERR 1
\end{lstlisting}
-If ``length of \field{read_buf}''=0 and ``length of \field{write_buf}''>0,
-the request is called write request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is set in the \field{flags}, then the
+request is called a read request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''=0,
-the request is called read request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is unset in the \field{flags}, then the
+request is called a write request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''>0,
-the request is called write-read request. It means an I2C write segment followed
-by a read segment. Usually, the write segment provides the number of an I2C
-controlled device register to be read.
+The \field{buf} is optional and will not be present for a zero-length request,
+like SMBus Quick.
-The case when ``length of \field{write_buf}''=0, and at the same time,
-``length of \field{read_buf}''=0 doesn't make any sense.
+The virtio I2C protocol supports write-read requests, i.e. an I2C write segment
+followed by a read segment (usually, the write segment provides the number of an
+I2C controlled device register to be read), by grouping a list of requests
+together using the \field{VIRTIO_I2C_FLAGS_FAIL_NEXT} flag.
\subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C Adapter Device / Device Operation: Operation Status}
-\field{addr}, \field{flags}, ``length of \field{write_buf}'' and ``length of \field{read_buf}''
-are determined by the driver, while \field{status} is determined by the processing
-of the device. A driver puts the data written to the device into \field{write_buf}, while
-a device puts the data of the corresponding length into \field{read_buf} according to the
-request of the driver.
+\field{addr}, \field{flags}, and ``length of \field{buf}'' are determined by the
+driver, while \field{status} is determined by the processing of the device. A
+driver, for a write request, puts the data to be written to the device into the
+\field{buf}, while a device, for a read request, puts the data read from device
+into the \field{buf} according to the request from the driver.
A driver may send one request or multiple requests to the device at a time.
The requests in the virtqueue are both queued and processed in order.
@@ -141,11 +140,16 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A driver MUST set the reserved bits of \field{flags} to be zero.
-The driver MUST NOT send a request with ``length of \field{write_buf}''=0 and
-``length of \field{read_buf}''=0 at the same time.
+A driver MUST NOT send the \field{buf}, for a zero-length request.
+
+A driver MUST NOT use \field{buf}, for a read request, if the final
+\field{status} returned from the device is VIRTIO_I2C_MSG_ERR.
-A driver MUST NOT use \field{read_buf} if the final \field{status} returned
-from the device is VIRTIO_I2C_MSG_ERR.
+A driver MUST set the \field{VIRTIO_I2C_FLAGS_M_RD} flag for a read operation,
+where the buffer is write-only for the device.
+
+A driver MUST NOT set the \field{VIRTIO_I2C_FLAGS_M_RD} flag for a write
+operation, where the buffer is read-only for the device.
A driver MUST queue the requests in order if multiple requests are going to
be sent at a time.
@@ -160,11 +164,13 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A device SHOULD keep consistent behaviors with the hardware as described in
\hyperref[intro:I2C]{I2C}.
-A device MUST NOT change the value of \field{addr}, reserved bits of \field{flags}
-and \field{write_buf}.
+A device MUST NOT change the value of \field{addr}, and reserved bits of
+\field{flags}.
+
+A device MUST not change the value of the \field{buf} for a write request.
-A device MUST place one I2C segment of the corresponding length into \field{read_buf}
-according the driver's request.
+A device MUST place one I2C segment of the ``length of \field{buf}'', for the
+read request, into the \field{buf} according the driver's request.
A device MUST guarantee the requests in the virtqueue being processed in order
if multiple requests are received at a time.
--
2.31.1.272.g89b43f80a514
Hi,
As we consider the next cycle for Project Stratos I would like to make
some more progress on hypervisor agnosticism for our virtio backends.
While we have implemented a number of virtio vhost-user backends using C
we've rapidly switched to using rust-vmm based ones for virtio-i2c,
virtio-rng and virtio-gpio. Given the interest in Rust for implementing
backends does it make sense to do some enabling work in rust-vmm to
support Xen?
There are two chunks of work I can think of:
1. Enough of libxl/hypervisor interface to implement an IOREQ end point.
This would require supporting enough of the hypervisor interface to
support the implementation of an IOREQ server. We would also need to
think about how we would map the IOREQ view of the world into the
existing vhost-user interface so we can re-use the current vhost-user
backends code base. The two approaches I can think of are:
a) implement a vhost-user master that speaks IOREQ to the hypervisor
and vhost-user to the vhost-user slave. In this case the bridge
would be standing in for something like QEMU.
b) implement some variants of the vhost-user slave traits that can
talk directly to the hypervisor to get/send the equivalent
kick/notify events. I don't know if this might be too complex as the
impedance matching between the two interfaces might be too great.
This assumes most of the setup is done by the existing toolstack, so
the existing libxl tools are used to create, connect and configure the
domains before the backend is launched.
which leads to:
2. The rest of the libxl/hypervisor interface.
This would be the rest of the interface to allow rust-vmm tools to be
written that could create, configure and manage Xen domains with pure
rust tools. My main concern about this is how rust-vmm's current model
(which is very much KVM influenced) will be able to handle the
differences for a type-1 hypervisor. Wei's pointed me to the Linux
support that was added to expose a Hyper-V control interface via the
Linux kernel. While I can see support has been merged on other rust
based projects I think the rust-vmm crate is still outstanding:
https://github.com/rust-vmm/community/issues/50
and I guess this would need revisiting for Xen to see if the proposed
abstraction would scale across other hypervisors.
Finally there is the question of how/if any of this would relate to the
concept of bare-metal rust backends? We've talked about bare metal
backends before but I wonder if the programming model for them is going
to be outside the scope of rust-vmm? Would be program just be hardwired
to IRQs and be presented a doorbell port to kick or would we want to
have at least some of the higher level rust-vmm abstractions for dealing
with navigating the virtqueues and responding and filling in data?
Thoughts?
--
Alex Bennée
Hello,
This patchset adds vhost-user-i2c device's support in Qemu. Initially I tried to
add the backend implementation as well into Qemu, but as I was looking for a
hypervisor agnostic backend implementation, I decided to keep it outside of
Qemu. Eventually I implemented it in Rust and it works very well with this
patchset, and it is under review [1] to be merged in common rust vhost devices
crate.
The kernel virtio I2C driver [2] is fully reviewed and is ready to be merged soon.
V1->V2:
- Dropped the backend support from qemu and minor cleanups.
I2C Testing:
------------
I didn't have access to a real hardware where I can play with a I2C
client device (like RTC, eeprom, etc) to verify the working of the
backend daemon, so I decided to test it on my x86 box itself with
hierarchy of two ARM64 guests.
The first ARM64 guest was passed "-device ds1338,address=0x20" option,
so it could emulate a ds1338 RTC device, which connects to an I2C bus.
Once the guest came up, ds1338 device instance was created within the
guest kernel by doing:
echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
[
Note that this may end up binding the ds1338 device to its driver,
which won't let our i2c daemon talk to the device. For that we need to
manually unbind the device from the driver:
echo 0-0020 > /sys/bus/i2c/devices/0-0020/driver/unbind
]
After this is done, you will get /dev/rtc1. This is the device we wanted
to emulate, which will be accessed by the vhost-user-i2c backend daemon
via the /dev/i2c-0 file present in the guest VM.
At this point we need to start the backend daemon and give it a
socket-path to talk to from qemu (you can pass -v to it to get more
detailed messages):
vhost-user-i2c --socket-path=vi2c.sock -l 0:32
[ Here, 0:32 is the bus/device mapping, 0 for /dev/i2c-0 and 32 (i.e.
0x20) is client address of ds1338 that we used while creating the
device. ]
Now we need to start the second level ARM64 guest (from within the first
guest) to get the i2c-virtio.c Linux driver up. The second level guest
is passed the following options to connect to the same socket:
-chardev socket,path=vi2c.sock0,id=vi2c \
-device vhost-user-i2c-pci,chardev=vi2c,id=i2c
Once the second level guest boots up, we will see the i2c-virtio bus at
/sys/bus/i2c/devices/i2c-X/. From there we can now make it emulate the
ds1338 device again by doing:
echo ds1338 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
[ This time we want ds1338's driver to be bound to the device, so it
should be enabled in the kernel as well. ]
And we will get /dev/rtc1 device again here in the second level guest.
Now we can play with the rtc device with help of hwclock utility and we
can see the following sequence of transfers happening if we try to
update rtc's time from system time.
hwclock -w -f /dev/rtc1 (in guest2) ->
Reaches i2c-virtio.c (Linux bus driver in guest2) ->
transfer over virtio ->
Reaches the qemu's vhost-i2c device emulation (running over guest1) ->
Reaches the backend daemon vhost-user-i2c started earlier (in guest1) ->
ioctl(/dev/i2c-0, I2C_RDWR, ..); (in guest1) ->
reaches qemu's hw/rtc/ds1338.c (running over host)
SMBUS Testing:
--------------
I wasn't required to have such a tedious setup for testing out with
SMBUS devices. I was able to emulate a SMBUS device on my x86 machine
using i2c-stub driver.
$ modprobe i2c-stub chip_addr=0x20
//Boot the arm64 guest now with i2c-virtio driver and then do:
$ echo al3320a 0x20 > /sys/class/i2c-adapter/i2c-0/new_device
$ cat /sys/bus/iio/devices/iio:device0/in_illuminance_raw
That's it.
I hope I was able to give a clear picture of my test setup here :)
--
Viresh
Viresh Kumar (3):
hw/virtio: add boilerplate for vhost-user-i2c device
hw/virtio: add vhost-user-i2c-pci boilerplate
MAINTAINERS: Add entry for virtio-i2c
MAINTAINERS | 7 +
hw/virtio/Kconfig | 5 +
hw/virtio/meson.build | 2 +
hw/virtio/vhost-user-i2c-pci.c | 69 +++++++
hw/virtio/vhost-user-i2c.c | 288 +++++++++++++++++++++++++++++
include/hw/virtio/vhost-user-i2c.h | 28 +++
6 files changed, 399 insertions(+)
create mode 100644 hw/virtio/vhost-user-i2c-pci.c
create mode 100644 hw/virtio/vhost-user-i2c.c
create mode 100644 include/hw/virtio/vhost-user-i2c.h
--
2.31.1.272.g89b43f80a514
Hi
I believe there is a hidden problem in the virtio implementation of Qemu
(up to 6.1.0) in calculating the offset of the "used" split vring and the
spec need some clarifications. Should anyone decide it deserve
upstream/spec changes, feel free to do so.
Cheers
FF
According to the specification in 2.6.2:
#define ALIGN(x) (((x) + qalign) & ~qalign)
static inline unsigned virtq_size(unsigned int qsz)
{
return ALIGN(sizeof(struct virtq_desc)*qsz + sizeof(u16)*(*3* + qsz))
}
And more specifically, "used" starts after "avail" as defined in 2.6.6
struct virtq_avail {
#define VIRTQ_AVAIL_F_NO_INTERRUPT 1
le16 flags;
le16 idx;
le16 ring[ /* Queue Size */ ];
*le16 used_event; /* Only if VIRTIO_F_EVENT_IDX */*
};
Linux and kvmtool calculates the offset with the formula:
LINUX: vring_init @ include/uapi/linux/virtio_ring.h
vr->avail = (struct vring_avail *)((char *)p + num * sizeof(struct
vring_desc));
vr->used = (void *)(((uintptr_t)&vr->avail->ring[num] *+ sizeof(__virtio16)*
+ align-1) & ~(align - 1));
The "+ sizeof(__virtio16)" properly accounts for the "used_event" in struct
virtue_avail.
Hypervisor ACRN uses a similar scheme: virtio_vq_init @
/devicemodel/hw/pci/virtio/virtio.c
vq->avail = (struct vring_avail *)vb;
vb += (2 + vq->qsize *+ 1*) * sizeof(uint16_t);
/* Then it's rounded up to the next page... */
vb = (char *)roundup2((uintptr_t)vb, VIRTIO_PCI_VRING_ALIGN);
/* ... and the last page(s) are the used ring. */
vq->used = (struct vring_used *)vb;
But Qemu uses: QEMU: virtio_queue_update_rings @ hw/virtio/virtio.c
vring->used = vring_align(vring->avail +
offsetof(VRingAvail, ring[vring->num]),
vring->align);
Linux alignment policies end up having vring->align values either 4096 (for
MMIO) or 64 (PCI), and thus there are no visible issues.
If you use a different OS that choses an alignment of 4 (valid as per
section 2.6) then Qemu does not calculate the same location for used and
virtio does not work anymore. The OS actually works fine with the alignment
of 4 with kvmtool and ACRN.
There are two other problems:
on the spec, the comment "/* Only if VIRTIO_F_EVENT_IDX */" on the avail
structure is not clear wether if:
- the field is always there but its content are only valid if...
- the field may be absent altogether.
Inferring from calculation formulae the field is always present, but some
language would help clarifying this.
On 2.6.2, the alignment formula "#define ALIGN(x) (((x) + qalign) &
~qalign)" is true if "qalign" is actually a mask as in many parts of the
spec, align is referred to as a power of 2. It may be good to change the
text with something like: #define ALIGN(x) (((x) + (qalign - 1)) & ~(qalign
-1)) /* where qalign is a power of Z */
--
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.ozog(a)linaro.org | Skype: ffozog
Hi
I was asked by AGL Virtualization AG if there was any implementation of the
virtio-video driver: could anyone let me know what I should answer?
Cheers
FF
--
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.ozog(a)linaro.org | Skype: ffozog
I top post as I find it difficult to identify where to make the comments.
1) BE acceleration
Network and storage backends may actually be executed in SmartNICs. As
virtio 1.1 is hardware friendly, there may be SmartNICs with virtio 1.1 PCI
VFs. Is it a valid use case for the generic BE framework to be used in this
context?
DPDK is used in some BE to significantly accelerate switching. DPDK is also
used sometimes in guests. In that case, there are no event injection but
just high performance memory scheme. Is this considered as a use case?
2) Virtio as OS HAL
Panasonic CTO has been calling for a virtio based HAL and based on the
teachings of Google GKI, an internal HAL seem inevitable in the long term.
Virtio is then a contender to Google promoted Android HAL. Could the
framework be used in that context?
On Wed, 11 Aug 2021 at 08:28, AKASHI Takahiro via Stratos-dev <
stratos-dev(a)op-lists.linaro.org> wrote:
> On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
> > email to let them read the full context.
> >
> > My comments below are related to a potential Xen implementation, not
> > because it is the only implementation that matters, but because it is
> > the one I know best.
>
> Please note that my proposal (and hence the working prototype)[1]
> is based on Xen's virtio implementation (i.e. IOREQ) and particularly
> EPAM's virtio-disk application (backend server).
> It has been, I believe, well generalized but is still a bit biased
> toward this original design.
>
> So I hope you like my approach :)
>
> [1]
> https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000546.html
>
> Let me take this opportunity to explain a bit more about my approach below.
>
> > Also, please see this relevant email thread:
> > https://marc.info/?l=xen-devel&m=162373754705233&w=2
> >
> >
> > On Wed, 4 Aug 2021, Alex Bennée wrote:
> > > Hi,
> > >
> > > One of the goals of Project Stratos is to enable hypervisor agnostic
> > > backends so we can enable as much re-use of code as possible and avoid
> > > repeating ourselves. This is the flip side of the front end where
> > > multiple front-end implementations are required - one per OS, assuming
> > > you don't just want Linux guests. The resultant guests are trivially
> > > movable between hypervisors modulo any abstracted paravirt type
> > > interfaces.
> > >
> > > In my original thumb nail sketch of a solution I envisioned vhost-user
> > > daemons running in a broadly POSIX like environment. The interface to
> > > the daemon is fairly simple requiring only some mapped memory and some
> > > sort of signalling for events (on Linux this is eventfd). The idea was
> a
> > > stub binary would be responsible for any hypervisor specific setup and
> > > then launch a common binary to deal with the actual virtqueue requests
> > > themselves.
> > >
> > > Since that original sketch we've seen an expansion in the sort of ways
> > > backends could be created. There is interest in encapsulating backends
> > > in RTOSes or unikernels for solutions like SCMI. There interest in Rust
> > > has prompted ideas of using the trait interface to abstract differences
> > > away as well as the idea of bare-metal Rust backends.
> > >
> > > We have a card (STR-12) called "Hypercall Standardisation" which
> > > calls for a description of the APIs needed from the hypervisor side to
> > > support VirtIO guests and their backends. However we are some way off
> > > from that at the moment as I think we need to at least demonstrate one
> > > portable backend before we start codifying requirements. To that end I
> > > want to think about what we need for a backend to function.
> > >
> > > Configuration
> > > =============
> > >
> > > In the type-2 setup this is typically fairly simple because the host
> > > system can orchestrate the various modules that make up the complete
> > > system. In the type-1 case (or even type-2 with delegated service VMs)
> > > we need some sort of mechanism to inform the backend VM about key
> > > details about the system:
> > >
> > > - where virt queue memory is in it's address space
> > > - how it's going to receive (interrupt) and trigger (kick) events
> > > - what (if any) resources the backend needs to connect to
> > >
> > > Obviously you can elide over configuration issues by having static
> > > configurations and baking the assumptions into your guest images
> however
> > > this isn't scalable in the long term. The obvious solution seems to be
> > > extending a subset of Device Tree data to user space but perhaps there
> > > are other approaches?
> > >
> > > Before any virtio transactions can take place the appropriate memory
> > > mappings need to be made between the FE guest and the BE guest.
> >
> > > Currently the whole of the FE guests address space needs to be visible
> > > to whatever is serving the virtio requests. I can envision 3
> approaches:
> > >
> > > * BE guest boots with memory already mapped
> > >
> > > This would entail the guest OS knowing where in it's Guest Physical
> > > Address space is already taken up and avoiding clashing. I would
> assume
> > > in this case you would want a standard interface to userspace to then
> > > make that address space visible to the backend daemon.
>
> Yet another way here is that we would have well known "shared memory"
> between
> VMs. I think that Jailhouse's ivshmem gives us good insights on this matter
> and that it can even be an alternative for hypervisor-agnostic solution.
>
> (Please note memory regions in ivshmem appear as a PCI device and can be
> mapped locally.)
>
> I want to add this shared memory aspect to my virtio-proxy, but
> the resultant solution would eventually look similar to ivshmem.
>
> > > * BE guests boots with a hypervisor handle to memory
> > >
> > > The BE guest is then free to map the FE's memory to where it wants in
> > > the BE's guest physical address space.
> >
> > I cannot see how this could work for Xen. There is no "handle" to give
> > to the backend if the backend is not running in dom0. So for Xen I think
> > the memory has to be already mapped
>
> In Xen's IOREQ solution (virtio-blk), the following information is expected
> to be exposed to BE via Xenstore:
> (I know that this is a tentative approach though.)
> - the start address of configuration space
> - interrupt number
> - file path for backing storage
> - read-only flag
> And the BE server have to call a particular hypervisor interface to
> map the configuration space.
>
> In my approach (virtio-proxy), all those Xen (or hypervisor)-specific
> stuffs are contained in virtio-proxy, yet another VM, to hide all details.
>
> # My point is that a "handle" is not mandatory for executing mapping.
>
> > and the mapping probably done by the
> > toolstack (also see below.) Or we would have to invent a new Xen
> > hypervisor interface and Xen virtual machine privileges to allow this
> > kind of mapping.
>
> > If we run the backend in Dom0 that we have no problems of course.
>
> One of difficulties on Xen that I found in my approach is that calling
> such hypervisor intefaces (registering IOREQ, mapping memory) is only
> allowed on BE servers themselvies and so we will have to extend those
> interfaces.
> This, however, will raise some concern on security and privilege
> distribution
> as Stefan suggested.
> >
> >
> > > To activate the mapping will
> > > require some sort of hypercall to the hypervisor. I can see two
> options
> > > at this point:
> > >
> > > - expose the handle to userspace for daemon/helper to trigger the
> > > mapping via existing hypercall interfaces. If using a helper you
> > > would have a hypervisor specific one to avoid the daemon having to
> > > care too much about the details or push that complexity into a
> > > compile time option for the daemon which would result in different
> > > binaries although a common source base.
> > >
> > > - expose a new kernel ABI to abstract the hypercall differences away
> > > in the guest kernel. In this case the userspace would essentially
> > > ask for an abstract "map guest N memory to userspace ptr" and let
> > > the kernel deal with the different hypercall interfaces. This of
> > > course assumes the majority of BE guests would be Linux kernels and
> > > leaves the bare-metal/unikernel approaches to their own devices.
> > >
> > > Operation
> > > =========
> > >
> > > The core of the operation of VirtIO is fairly simple. Once the
> > > vhost-user feature negotiation is done it's a case of receiving update
> > > events and parsing the resultant virt queue for data. The vhost-user
> > > specification handles a bunch of setup before that point, mostly to
> > > detail where the virt queues are set up FD's for memory and event
> > > communication. This is where the envisioned stub process would be
> > > responsible for getting the daemon up and ready to run. This is
> > > currently done inside a big VMM like QEMU but I suspect a modern
> > > approach would be to use the rust-vmm vhost crate. It would then either
> > > communicate with the kernel's abstracted ABI or be re-targeted as a
> > > build option for the various hypervisors.
> >
> > One thing I mentioned before to Alex is that Xen doesn't have VMMs the
> > way they are typically envisioned and described in other environments.
> > Instead, Xen has IOREQ servers. Each of them connects independently to
> > Xen via the IOREQ interface. E.g. today multiple QEMUs could be used as
> > emulators for a single Xen VM, each of them connecting to Xen
> > independently via the IOREQ interface.
> >
> > The component responsible for starting a daemon and/or setting up shared
> > interfaces is the toolstack: the xl command and the libxl/libxc
> > libraries.
>
> I think that VM configuration management (or orchestration in Startos
> jargon?) is a subject to debate in parallel.
> Otherwise, is there any good assumption to avoid it right now?
>
> > Oleksandr and others I CCed have been working on ways for the toolstack
> > to create virtio backends and setup memory mappings. They might be able
> > to provide more info on the subject. I do think we miss a way to provide
> > the configuration to the backend and anything else that the backend
> > might require to start doing its job.
> >
> >
> > > One question is how to best handle notification and kicks. The existing
> > > vhost-user framework uses eventfd to signal the daemon (although QEMU
> > > is quite capable of simulating them when you use TCG). Xen has it's own
> > > IOREQ mechanism. However latency is an important factor and having
> > > events go through the stub would add quite a lot.
> >
> > Yeah I think, regardless of anything else, we want the backends to
> > connect directly to the Xen hypervisor.
>
> In my approach,
> a) BE -> FE: interrupts triggered by BE calling a hypervisor interface
> via virtio-proxy
> b) FE -> BE: MMIO to config raises events (in event channels), which is
> converted to a callback to BE via virtio-proxy
> (Xen's event channel is internnally implemented by
> interrupts.)
>
> I don't know what "connect directly" means here, but sending interrupts
> to the opposite side would be best efficient.
> Ivshmem, I suppose, takes this approach by utilizing PCI's msi-x mechanism.
>
> >
> > > Could we consider the kernel internally converting IOREQ messages from
> > > the Xen hypervisor to eventfd events? Would this scale with other
> kernel
> > > hypercall interfaces?
> > >
> > > So any thoughts on what directions are worth experimenting with?
> >
> > One option we should consider is for each backend to connect to Xen via
> > the IOREQ interface. We could generalize the IOREQ interface and make it
> > hypervisor agnostic. The interface is really trivial and easy to add.
>
> As I said above, my proposal does the same thing that you mentioned here :)
> The difference is that I do call hypervisor interfaces via virtio-proxy.
>
> > The only Xen-specific part is the notification mechanism, which is an
> > event channel. If we replaced the event channel with something else the
> > interface would be generic. See:
> >
> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/io…
> >
> > I don't think that translating IOREQs to eventfd in the kernel is a
> > good idea: if feels like it would be extra complexity and that the
> > kernel shouldn't be involved as this is a backend-hypervisor interface.
>
> Given that we may want to implement BE as a bare-metal application
> as I did on Zephyr, I don't think that the translation would not be
> a big issue, especially on RTOS's.
> It will be some kind of abstraction layer of interrupt handling
> (or nothing but a callback mechanism).
>
> > Also, eventfd is very Linux-centric and we are trying to design an
> > interface that could work well for RTOSes too. If we want to do
> > something different, both OS-agnostic and hypervisor-agnostic, perhaps
> > we could design a new interface. One that could be implementable in the
> > Xen hypervisor itself (like IOREQ) and of course any other hypervisor
> > too.
> >
> >
> > There is also another problem. IOREQ is probably not be the only
> > interface needed. Have a look at
> > https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
> > an interface for the backend to inject interrupts into the frontend? And
> > if the backend requires dynamic memory mappings of frontend pages, then
> > we would also need an interface to map/unmap domU pages.
>
> My proposal document might help here; All the interfaces required for
> virtio-proxy (or hypervisor-related interfaces) are listed as
> RPC protocols :)
>
> > These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> > and self-contained. It is easy to add anywhere. A new interface to
> > inject interrupts or map pages is more difficult to manage because it
> > would require changes scattered across the various emulators.
>
> Exactly. I have no confident yet that my approach will also apply
> to other hypervisors than Xen.
> Technically, yes, but whether people can accept it or not is a different
> matter.
>
> Thanks,
> -Takahiro Akashi
>
> --
> Stratos-dev mailing list
> Stratos-dev(a)op-lists.linaro.org
> https://op-lists.linaro.org/mailman/listinfo/stratos-dev
>
--
François-Frédéric Ozog | *Director Business Development*
T: +33.67221.6485
francois.ozog(a)linaro.org | Skype: ffozog
The I2C protocol allows zero-length requests with no data, like the
SMBus Quick command, where the command is inferred based on the
read/write flag itself.
In order to allow such a request, allocate another bit,
VIRTIO_I2C_FLAGS_M_RD(1), in the flags to pass the request type, as read
or write. This was earlier done using the read/write permission to the
buffer itself.
This still won't work well if multiple buffers are passed for the same
request, i.e. the write-read requests, as the VIRTIO_I2C_FLAGS_M_RD flag
can only be used with a single buffer.
Coming back to it, there is no need to send multiple buffers with a
single request. All we need, is a way to group several requests
together, which we can already do based on the
VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
Remove support for multiple buffers within a single request.
Since we are at very early stage of development currently, we can do
these modifications without addition of new features or versioning of
the protocol.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V1->V2:
- Name the buffer-less request as zero-length request.
Hi Guys,
I did try to follow the discussion you guys had during V4, where we added
support for multiple buffers for the same request, which I think is unnecessary
now, after introduction of the VIRTIO_I2C_FLAGS_FAIL_NEXT flag.
https://lists.oasis-open.org/archives/virtio-comment/202011/msg00005.html
And so starting this discussion again, because we need to support stuff
like: i2cdetect -q <i2c-bus-number>, which issues a zero-length SMBus
Quick command.
---
virtio-i2c.tex | 60 +++++++++++++++++++++++++-------------------------
1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/virtio-i2c.tex b/virtio-i2c.tex
index 949d75f44158..ae344b2bc822 100644
--- a/virtio-i2c.tex
+++ b/virtio-i2c.tex
@@ -54,8 +54,7 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
\begin{lstlisting}
struct virtio_i2c_req {
struct virtio_i2c_out_hdr out_hdr;
- u8 write_buf[];
- u8 read_buf[];
+ u8 buf[];
struct virtio_i2c_in_hdr in_hdr;
};
\end{lstlisting}
@@ -84,16 +83,16 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
and sets it on the other requests. If this bit is set and a device fails
to process the current request, it needs to fail the next request instead
of attempting to execute it.
+
+\item[VIRTIO_I2C_FLAGS_M_RD(1)] is used to mark the request as READ or WRITE.
\end{description}
Other bits of \field{flags} are currently reserved as zero for future feature
extensibility.
-The \field{write_buf} of the request contains one segment of an I2C transaction
-being written to the device.
-
-The \field{read_buf} of the request contains one segment of an I2C transaction
-being read from the device.
+The \field{buf} of the request is optional and contains one segment of an I2C
+transaction being read from or written to the device, based on the value of the
+\field{VIRTIO_I2C_FLAGS_M_RD} bit in the \field{flags} field.
The final \field{status} byte of the request is written by the device: either
VIRTIO_I2C_MSG_OK for success or VIRTIO_I2C_MSG_ERR for error.
@@ -103,27 +102,27 @@ \subsubsection{Device Operation: Request Queue}\label{sec:Device Types / I2C Ada
#define VIRTIO_I2C_MSG_ERR 1
\end{lstlisting}
-If ``length of \field{read_buf}''=0 and ``length of \field{write_buf}''>0,
-the request is called write request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is set in the \field{flags}, then the
+request is called a read request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''=0,
-the request is called read request.
+If \field{VIRTIO_I2C_FLAGS_M_RD} bit is unset in the \field{flags}, then the
+request is called a write request.
-If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''>0,
-the request is called write-read request. It means an I2C write segment followed
-by a read segment. Usually, the write segment provides the number of an I2C
-controlled device register to be read.
+The \field{buf} is optional and will not be present for a zero-length request,
+like SMBus Quick.
-The case when ``length of \field{write_buf}''=0, and at the same time,
-``length of \field{read_buf}''=0 doesn't make any sense.
+The virtio I2C protocol supports write-read requests, i.e. an I2C write segment
+followed by a read segment (usually, the write segment provides the number of an
+I2C controlled device register to be read), by grouping a list of requests
+together using the \field{VIRTIO_I2C_FLAGS_FAIL_NEXT} flag.
\subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C Adapter Device / Device Operation: Operation Status}
-\field{addr}, \field{flags}, ``length of \field{write_buf}'' and ``length of \field{read_buf}''
-are determined by the driver, while \field{status} is determined by the processing
-of the device. A driver puts the data written to the device into \field{write_buf}, while
-a device puts the data of the corresponding length into \field{read_buf} according to the
-request of the driver.
+\field{addr}, \field{flags}, and ``length of \field{buf}'' are determined by the
+driver, while \field{status} is determined by the processing of the device. A
+driver, for a write request, puts the data to be written to the device into the
+\field{buf}, while a device, for a read request, puts the data read from device
+into the \field{buf} according to the request from the driver.
A driver may send one request or multiple requests to the device at a time.
The requests in the virtqueue are both queued and processed in order.
@@ -141,11 +140,10 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A driver MUST set the reserved bits of \field{flags} to be zero.
-The driver MUST NOT send a request with ``length of \field{write_buf}''=0 and
-``length of \field{read_buf}''=0 at the same time.
+A driver MUST NOT send the \field{buf}, for a zero-length request.
-A driver MUST NOT use \field{read_buf} if the final \field{status} returned
-from the device is VIRTIO_I2C_MSG_ERR.
+A driver MUST NOT use \field{buf}, for a read request, if the final
+\field{status} returned from the device is VIRTIO_I2C_MSG_ERR.
A driver MUST queue the requests in order if multiple requests are going to
be sent at a time.
@@ -160,11 +158,13 @@ \subsubsection{Device Operation: Operation Status}\label{sec:Device Types / I2C
A device SHOULD keep consistent behaviors with the hardware as described in
\hyperref[intro:I2C]{I2C}.
-A device MUST NOT change the value of \field{addr}, reserved bits of \field{flags}
-and \field{write_buf}.
+A device MUST NOT change the value of \field{addr}, and reserved bits of
+\field{flags}.
+
+A device MUST not change the value of the \field{buf} for a write request.
-A device MUST place one I2C segment of the corresponding length into \field{read_buf}
-according the driver's request.
+A device MUST place one I2C segment of the ``length of \field{buf}'', for the
+read request, into the \field{buf} according the driver's request.
A device MUST guarantee the requests in the virtqueue being processed in order
if multiple requests are received at a time.
--
2.31.1.272.g89b43f80a514