Hi Jonathan, Lorenzo, all,
Do we have any topics to sync on next week?
Thanks :)
Joyce
> On 13 Sep 2022, at 8:00 AM, linaro-open-discussions-request@op-lists.linaro.org wrote:
>
> Send Linaro-open-discussions mailing list submissions to
> linaro-open-discussions(a)op-lists.linaro.org
>
> To subscribe or unsubscribe via email, send a message with subject or
> body 'help' to
> linaro-open-discussions-request(a)op-lists.linaro.org
>
> You can reach the person managing the list at
> linaro-open-discussions-owner(a)op-lists.linaro.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linaro-open-discussions digest..."
>
> Today's Topics:
>
> 1. Re: [RFC PATCH v0.1 22/25] ACPI: add support to register CPUs based on the _STA enabled bit
> (Salil Mehta)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 12 Sep 2022 18:09:34 +0000
> From: Salil Mehta <salil.mehta(a)huawei.com>
> Subject: [Linaro-open-discussions] Re: [RFC PATCH v0.1 22/25] ACPI:
> add support to register CPUs based on the _STA enabled bit
> To: James Morse <james.morse(a)arm.com>,
> "linaro-open-discussions(a)op-lists.linaro.org"
> <linaro-open-discussions(a)op-lists.linaro.org>
> Cc: "lorenzo.pieralisi(a)linaro.org" <lorenzo.pieralisi(a)linaro.org>
> Message-ID: <65c52e8ba75e4cc59ec4b88a44c8a13b(a)huawei.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi James
>
>> From: James Morse [mailto:james.morse@arm.com]
>> Sent: Friday, September 9, 2022 5:53 PM
>> To: Salil Mehta <salil.mehta(a)huawei.com>;
>> linaro-open-discussions(a)op-lists.linaro.org
>> Cc: lorenzo.pieralisi(a)linaro.org; Jean-Philippe Brucker
>> <jean-philippe(a)linaro.org>; Jonathan Cameron <jonathan.cameron(a)huawei.com>
>> Subject: Re: [RFC PATCH v0.1 22/25] ACPI: add support to register CPUs based
>> on the _STA enabled bit
>>
>> Hi Salil,
>>
>> On 09/09/2022 15:53, Salil Mehta wrote:
>>>> From: James Morse [mailto:james.morse@arm.com]
>>>> Sent: Wednesday, August 31, 2022 12:09 PM
>>>> To: linaro-open-discussions(a)op-lists.linaro.org
>>>> Cc: Salil Mehta <salil.mehta(a)huawei.com>; james.morse(a)arm.com;
>>>> lorenzo.pieralisi(a)linaro.org; Jean-Philippe Brucker
>>>> <jean-philippe(a)linaro.org>
>>>> Subject: [RFC PATCH v0.1 22/25] ACPI: add support to register CPUs based on the _STA enabled bit
>>>>
>>>> acpi_processor_get_info() registers all present CPUs. Registering a
>>>> CPU is what creates the sysfs entries and triggers the udev
>>>> notifications.
>>>>
>>>> arm64 virtual machines that support 'virtual cpu hotplug' use the
>>>> enabled bit to indicate whether the CPU can be brought online, as
>>>> the existing ACPI tables require all hardware to be described and
>>>> present.
>>>>
>>>> If firmware describes a CPU as present, but disabled, skip the
>>>> registration. Such CPUs are present, but can't be brought online for
>>>> whatever reason. (e.g. firmware/hypervisor policy).
>>>>
>>>> Once firmware sets the enabled bit, the CPU can be registered and
>>>> brought online by user-space. Online CPUs, or CPUs that are missing
>>>> an _STA method must always be registered.
>>
>>>> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
>>>> index 1bd6e4b8ab66..42521d89c378 100644
>>>> --- a/drivers/acpi/acpi_processor.c
>>>> +++ b/drivers/acpi/acpi_processor.c
>>>> @@ -194,6 +194,32 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
>>>>  	return ret;
>>>>  }
>>>>
>>>> +static int acpi_processor_make_enabled(struct acpi_processor *pr)
>>>> +{
>>>> +	unsigned long long sta;
>>>> +	acpi_status status;
>>>> +	bool present, enabled;
>>>> +
>>>> +	if (!acpi_has_method(pr->handle, "_STA"))
>>>> +		return arch_register_cpu(pr->id);
>>>> +
>>>> +	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
>>>> +	if (ACPI_FAILURE(status))
>>>> +		return -ENODEV;
>>>> +
>>>> +	present = sta & ACPI_STA_DEVICE_PRESENT;
>>>> +	enabled = sta & ACPI_STA_DEVICE_ENABLED;
>>>> +
>>>> +	if (cpu_online(pr->id) && (!present || !enabled)) {
>>>> +		pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
>>>> +		add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
>>>> +	} else if (!present || !enabled) {
>>>> +		return -ENODEV;
>>>> +	}
>>
>>> This change and setting all possible cpus as *present* in smp_prepare_cpus()
>>> will always cause all present == possible in the guest kernel.
>>
>> This is quite deliberate. I don't want to redefine present without a machine
>> that actually supports hotplug/package-hotadd. This stuff is the tip of an
>> ill-defined iceberg in the ACPI spec. Once there is hardware that supports
>> this, we will have a better idea of what needs changing. Until then:
>> everything described by ACPI must be present.
>
>
> The present mask operates on logical CPU IDs. The latter are more closely related
> to the Linux abstract model. I see no problem in masking certain available devices
> (in this case CPUs) from the user. This is done in many places inside the kernel to
> intentionally hide, or only conditionally expose, certain devices to the user even
> after they have been discovered at boot time or later.
>
> As such, this change can coexist irrespective of whether hotplug or hot-add will
> ever exist in the system.
>
> I agree with the ACPI part, and maybe the interface is broken, but then you have used
> ACPI_STA_DEVICE_ENABLED, which has not been used so far in the ACPI-related
> acpi_processor.c code. How can you make sure this bit is being set by the firmware
> of other architectures, especially legacy ones?
>
>
>>> I think we
>>> can avoid that by the trick which Jean-Philippe exploited in his patch-set[1]
>>> sent earlier last year.
>>
>> That was the other side of this:
>> https://gitlab.arm.com/linux-arm/linux-jm/-/commit/3106cccf5b9f01f44789b748aaee3a95fee99a97
>>
>> This was an attempt to do all this without changes to the ACPI spec - it doesn't
>> touch the present cpumask.
>
>
> Yes, I did refer to those, but the idea was not to use that change as is.
>
>
>> [..]
>>
>>> This shall ensure that we correctly reflect only present vcpus to the linux
>>> kernel although the sizing and initialization of the GICC/GICR would have
>>> already happened for the complete set for possible vcpus i.e. the ones with
>>>
>>> [1] _STA[0] is set & _STA[1] bit is set and
>>> [2] Either GICC_flag_Intf_Flag.Enabled set OR GICC_flag.online_capable set
>>>
>>> so effectively we are only deferring populating the cpu present mask for the
>>> disabled cpus but which are now online capable(or Hotplug capable in future?)
>>
>> What is the user observable effect of the kernel knowing these CPUs are really
>> present?
>
> The user interface looks inconsistent and can break existing scripts.
>
> As you can see, the user requested a maximum of 6 possible CPUs and 4 cold-booted
> CPUs. Hence 4 cpu directories are correctly shown, but the total number of
> present CPUs is shown as 6 (i.e. 0-5).
>
> If we can defer the registration of the disabled CPUs (those that are online
> capable, i.e. possible minus present), then I don't see why we can't also hide
> these CPUs by not marking them as present to the user, so that the entries
> are consistent. As it stands, scripts/utils using these values can go horribly
> wrong.
>
> At Guest Kernel
> ---------------
> estuary:/$ ls -al /sys/devices/system/cpu/
> total 0
> drwxr-xr-x 12 root 0 0 Sep 9 19:19 .
> drwxr-xr-x 8 root 0 0 Sep 9 19:19 ..
> drwxr-xr-x 7 root 0 0 Sep 9 19:19 cpu0
> drwxr-xr-x 7 root 0 0 Sep 9 19:19 cpu1
> drwxr-xr-x 7 root 0 0 Sep 9 19:19 cpu2
> drwxr-xr-x 7 root 0 0 Sep 9 19:19 cpu3
> drwxr-xr-x 2 root 0 0 Sep 9 19:19 cpufreq
> drwxr-xr-x 2 root 0 0 Sep 9 19:19 cpuidle
> [...]
>
> estuary:/$ cat /sys/devices/system/cpu/possible
> 0-5
> estuary:/$ cat /sys/devices/system/cpu/present
> 0-5
> estuary:/$ cat /sys/devices/system/cpu/offline
> 4-5
> estuary:/$
>
> At Qemu
> -------
> $QEMUBIN --enable-kvm -machine virt,gic-version=3 -cpu host -smp cpus=4,maxcpus=6
> -append "console=ttyAMA0 root=/dev/ram earlycon rdinit=/init maxcpus=4 acpi=force"
>
>
>> The intention of this series is to do this as pure policy.
>>
>> I anticipate pressure on the "use the MADT GICR" line, even though ACPI doesn't
>> say anything about the presence of MADT GICC's redistributor entry. If this
>> happens, we'd depend on present meaning present.
>
>
> If we are confident that flag ACPI_STA_DEVICE_ENABLED is being set properly by
> ARM and other architecture firmware, then Qemu can take care of that policy. It
> has all the information of the vcpus which are possible and disabled (but are
> online capable). We can use this info to conditionally return appropriate status
> when _STA ACPI method is evaluated.
>
> I intentionally refrained from using this approach in my first RFC[1], as the
> default code in acpi_processor.c was only making use of the
> ACPI_STA_DEVICE_PRESENT bit after evaluating the _STA method. Qemu was also
> setting only the present bit in the returned status value. Plus, I wanted to
> minimize the changes in the kernel in the first version of the RFC.
>
>
> [1] https://lore.kernel.org/qemu-devel/38a034f82da78b8861af6d25a83fddea@kernel.…
>
>>
>> All the hotplug/package-hotadd machinery is triggered by udev. We don't need
>> to hack the cpu present mask to make that work.
>
>
> May I know what exactly are your apprehensions with 'udev'?
>
> As such, 'udev' should make use of the Linux device model, and it is not necessary
> to present a 1:1 picture of the hardware to the abstract model (which, by the way,
> we are already not doing by not registering the disabled CPUs). It will just expose
> to the user whatever limited picture of the hardware the kernel presents.
>
> AFAICS it should work just fine but we need to limit the present cpus.
>
>
>>> Question:
>>> Q1: The current acpi_processor.c code does not use the ACPI_STA_DEVICE_{ENABLED, UI}
>>> bits. Could it break other architectures if we use these bits but some of their
>>> legacy devices or firmware do not initialize these bits to their defaults?
>>
>> Almost certainly! I'm pretty confident some vendors generate their ACPI tables
>> using markov-models. (It boots! Ship it!)
>>
>> The approach that used the UI bit to mean sysfs had to be hidden behind a
>> Kconfig symbol, which is only marginally better than #ifdef CONFIG_ARM64.
>
>
> If there are problems in using the ACPI_STA_DEVICE_UI Bit because it might
> conflict with the legacy firmware of other architectures then let us drop that.
>
> We can alternatively use the ACPI_STA_DEVICE_ENABLED Bit in the _STA method
> which can be conditionally set by the Qemu?
>
>
>> This new version walks a fine line described in the cover-letter: any platform
>> with firmware tables that get this wrong should get the same user-experience as
>> there is no policy enforcement on x86, so the !online_capable CPUs can be
>> detected as being online, and the policy stuff gets ignored.
>
> Yes, I do understand your predicament, but ideally the user experience is dictated
> by what the *end* user sees. Here, by not masking the disabled CPUs in the cpu
> present mask, users will not get a similar experience on ARM64 and x86_64 platforms.
> That is undeniable, and in the end it will matter the most, since this feature will
> mostly be used on servers.
>
>
> Thanks
> Salil
>
> ------------------------------
>
> Subject: Digest Footer
>
> Linaro-open-discussions mailing list -- linaro-open-discussions(a)op-lists.linaro.org
> To unsubscribe send an email to linaro-open-discussions-leave(a)op-lists.linaro.org
>
>
> ------------------------------
>
> End of Linaro-open-discussions Digest, Vol 24, Issue 4
> ******************************************************
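The registration policy described by the patch quoted in the digest above can be sketched in plain user-space C. This is a minimal sketch, not the kernel code: sta_policy() and enum reg_decision are hypothetical names introduced here, the two masks are the ACPI-defined _STA present (bit 0) and enabled (bit 1) bits, and a failed _STA evaluation (the -ENODEV path in the patch) is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* _STA status bits: bit 0 = present, bit 1 = enabled. */
#define STA_PRESENT 0x01u
#define STA_ENABLED 0x02u

/* Hypothetical outcome codes for this sketch only. */
enum reg_decision {
    REG_REGISTER,        /* register the CPU (sysfs entry, udev event)    */
    REG_SKIP,            /* leave it unregistered (the -ENODEV path)      */
    REG_REGISTER_FW_BUG  /* online but described as absent or disabled    */
};

/*
 * Decide what to do with one CPU.
 * has_sta: whether the processor object has an _STA method at all
 * sta:     value returned by _STA (ignored when has_sta is false)
 * online:  whether the CPU is already online
 */
static enum reg_decision sta_policy(bool has_sta, uint64_t sta, bool online)
{
    /* CPUs without an _STA method must always be registered. */
    if (!has_sta)
        return REG_REGISTER;

    bool present = sta & STA_PRESENT;
    bool enabled = sta & STA_ENABLED;

    if (present && enabled)
        return REG_REGISTER;

    /* An online CPU is still registered, but the firmware is flagged. */
    if (online)
        return REG_REGISTER_FW_BUG;

    /* Present-but-disabled (or absent) and offline: skip for now. */
    return REG_SKIP;
}
```

With these assumptions, a vCPU whose _STA returns only the present bit is skipped until firmware also sets the enabled bit, which is the case the thread is debating.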
Hi all,
in the meeting held today we decided to proceed as follows with the
ARM64 virt CPU hotplug patches and QEMU changes:
- James requires QEMU changes to test his branch [1]
- Huawei (Salil) agreed to update the QEMU patches to the latest ACPI
specs (and James' code [1])
- QEMU updates will be provided as a branch/link in reply to this email thread
so that James can complete testing of [1] against them
- Any testing, bug report, communication will take place through this
mailing list before public posting on a kernel ML, so please keep an
eye on this thread if you'd like to collaborate/help
- It would be good to get some feedback from containers/kubernetes
developers on the full software stack - after all we are making these
changes to enable the ecosystem
Please chime in if I forgot something or reach out, all comments are
welcome.
Lorenzo
[1] https://gitlab.arm.com/linux-arm/linux-jm/-/tree/virtual_cpu_hotplug/rfc/v0
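For anyone testing, the mismatch reported earlier in this thread (four cpuN directories while /sys/devices/system/cpu/present reads 0-5) can be checked mechanically. A minimal sketch, assuming the usual sysfs cpulist format of comma-separated numbers and ranges; cpulist_count() is a hypothetical helper written for this note, not a kernel or libc API.

```c
#include <stdlib.h>

/*
 * Count the CPUs named in a sysfs-style list such as "0-5" or "0-3,6".
 * A trailing newline (as read from sysfs) is accepted.
 * Returns -1 on malformed input, 0 for an empty list.
 */
static int cpulist_count(const char *s)
{
    int count = 0;

    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);

        if (end == s || lo < 0)
            return -1;          /* no digits, or a negative number */

        long hi = lo;
        if (*end == '-') {      /* range form "lo-hi" */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s || hi < lo)
                return -1;
        }

        count += (int)(hi - lo + 1);

        if (*end == ',')
            end++;              /* move on to the next entry */
        else if (*end != '\0' && *end != '\n')
            return -1;          /* unexpected separator */
        s = end;
    }
    return count;
}
```

With the values from the guest output quoted earlier, cpulist_count("0-5") gives 6 for present and cpulist_count("4-5") gives 2 for offline, so 4 CPUs are online, matching the four cpuN directories while present still reports 6.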
Linaro Open Discussions monthly meeting
Tuesday 23 Aug 2022 ⋅ 22:00 – 23:00
Hong Kong Standard Time
Location
https://linaro-org.zoom.us/j/95682500341
Joyce QI is inviting you to a scheduled Zoom meeting.
Join Zoom Meeting
https://linaro-org.zoom.us/j/95682500341
Meeting ID: 956 8250 0341
One tap mobile
+16699009128,,95682500341# US (San Jose)
+13462487799,,95682500341# US (Houston)
Dial by your location
+1 669 900 9128 US (San Jose)
+1 346 248 7799 US (Houston)
+1 253 215 8782 US (Tacoma)
+1 646 558 8656 US (New York)
+1 301 715 8592 US (Washington DC)
+1 312 626 6799 US (Chicago)
888 788 0099 US Toll-free
877 853 5247 US Toll-free
Meeting ID: 956 8250 0341
Find your local number: https://linaro-org.zoom.us/u/ady2J9Zn7t
Guests
lorenzo.pieralisi(a)arm.com
ilkka(a)os.amperecomputing.com
jonathan.cameron(a)huawei.com
salil.mehta(a)huawei.com
james.morse(a)arm.com
Mike Holmes
linaro-open-discussions(a)op-lists.linaro.org
Your attendance is optional.
Hi folks,
I would like to take advantage of the upcoming call on Tuesday to
resume the virt CPU hotplug topic and sync up on that.
It would probably be great if we could make the time more US friendly,
given that there are folks who want to join from the US, but I understand
it then becomes problematic for other people to join.
Please let me know if we can have a meeting on Tuesday and at
what time.
Thanks,
Lorenzo
Hi Jonathan, Lorenzo,
Do we have any topics to discuss for Linaro-open-discussions next week?
Thanks :)
Joyce
> On 14 Apr 2022, at 8:00 AM, linaro-open-discussions-request@op-lists.linaro.org wrote:
>
> Today's Topics:
>
> 1. Re: LOD Call notes: 22 March 2022 - vCPU Hotplug Update
> (Jonathan Cameron)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 13 Apr 2022 19:17:32 +0100
> From: Jonathan Cameron <Jonathan.Cameron(a)Huawei.com>
> Subject: [Linaro-open-discussions] Re: LOD Call notes: 22 March 2022 -
> vCPU Hotplug Update
> To: Ilkka Koskinen <ilkka(a)os.amperecomputing.com>
> Cc: linaro-open-discussions(a)op-lists.linaro.org
> Message-ID: <20220413191732.00000e4e(a)Huawei.com>
> Content-Type: text/plain; charset="US-ASCII"
>
> On Thu, 31 Mar 2022 11:47:26 -0700 (PDT)
> Ilkka Koskinen <ilkka(a)os.amperecomputing.com> wrote:
>
>> Hi everyone,
>>
>> On Tue, 22 Mar 2022, Jonathan Cameron via Linaro-open-discussions wrote:
>>> Hi All,
>>>
>>> A quick set of notes on the discussion on vCPU hotplug.
>>>
>>> Great to have Ed join the discussion. If we are going to
>>> have further calls on this topic we may want to move the time to be
>>> more friendly for the US.
>>>
>>> Current status
>>> * ACPI spec change via code first route approved.
>>> * Kernel patches being reworked / rebased by ARM. Expected to be sent
>>> out in 5.19 cycle.
>>> * QEMU patches being updated by Huawei. Will do at least some light testing
>>> and then push out a public git tree so that others can test - will aim
>>> to do this so it aligns with kernel code availability.
>>> * Ed (Ampere) gave an update to say they are interested in pushing this
>>> forwards ASAP and have customer / OS vendor engagement which will be
>>> very useful in moving towards a complete solution, particularly ensuring
>>> good test coverage etc.
>>>
>>> Noted that current QEMU patches may not cover all corner cases, for example
>>> live migration of VMs that have vCPUs hotplugged. Might 'just work'
>>> but we haven't tested it yet so probably not.
>>>
>>> Given we didn't really add anything on DOE / SPDM over previous calls,
>>> I don't plan to send out anything on that topic this time.
>>>
>>> Thanks to Joyce for hosting the call. If I noted down anything wrong,
>>> or incomplete let the list know.
>>>
>>> Jonathan
>>> --
>>> Linaro-open-discussions mailing list -- linaro-open-discussions(a)op-lists.linaro.org
>>> https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Home
>>
>> Thanks for the great meeting and the update! As Ed mentioned (and you
>> wrote above), we're interested in joining the vCPU Hotplug development.
>
> Hi Ilkka
>
> Great to have you on board for this.
>
> Status wise, on QEMU side we have the new interface up and running
> (it was a whole 2 lines of code once we'd dealt with some rebasing related issues).
>
> Rather more changes were needed to make it work sensibly on top of Gavin Shan's
> QEMU topology configuration patch set.
> https://lore.kernel.org/qemu-devel/20220403145953.10522-1-gshan@redhat.com/
> The code is going through some light internal review before we put it up somewhere
> public. Note it's definitely not production quality code but it is somewhere
> to start from. Given Easter related vacations I doubt we'll get enough
> eyes on it until next week.
>
> We were thrown briefly by the fact there is a recently added
> acpica/actbl2.h entry for MADT_ONLINE_CAPABLE but it's a different bit.
> Seems x86 folk wanted something similar last year. We'll have to do something
> ugly in ACPICA to mangle the name for the new bit to avoid that collision.
> Lorenzo, I'm assuming the ACPICA one line patch to add that define is
> something the ARM team will deal with?
>
> There are some known limitations though that we'll need to sort out before
> upstreaming the QEMU support. One of which is SVE currently breaks things.
> Plenty of time though as kernel needs to be upstream first anyway.
>
> Testing wise we are using QEMU on top of QEMU so we can poke corners of the
> architecture we don't have hardware for (e.g. SVE :). Even on a rubbish x86
> desktop it's not that slow :)
>
> Lorenzo, any update on kernel side of things or expected time scale for
> more information?
>
> Obviously we have some hacked patches based on Salil's original proposal
> that sanity check the QEMU side of things but I suspect the final version
> will look rather different :)
>
> One question on the spec change for Lorenzo. It's not entirely clear
> but I think we should not be using _MAT when 'hotplugging' the vCPUs?
> For now the QEMU code provides the relevant entries anyway but
> it would be nice to drop that if not necessary (it's not a huge
> amount of code or complexity though so not a big thing either way).
>
> Thanks and an early happy Easter to those celebrating on Sunday.
>
> Jonathan
>
>>
>> Br, Ilkka
>
>
> ------------------------------
>
> End of Linaro-open-discussions Digest, Vol 19, Issue 1
> ******************************************************
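The MADT_ONLINE_CAPABLE name collision mentioned in the digest above comes from two different flag fields using a similarly named bit at different positions. A minimal sketch, with loud assumptions: the macro names below are hypothetical, and the bit positions (bit 1 in the x86 Local APIC flags, bit 3 in the arm64 GICC flags) are my reading of recent ACPI revisions rather than something stated in this thread. gicc_cpu_possible() mirrors the "Enabled set OR online capable set" condition quoted in the September discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 MADT Local APIC flags (assumed positions). */
#define LAPIC_FLAG_ENABLED         (1u << 0)
#define LAPIC_FLAG_ONLINE_CAPABLE  (1u << 1)

/* arm64 MADT GICC flags (assumed positions; bit 1 means something else here). */
#define GICC_FLAG_ENABLED          (1u << 0)
#define GICC_FLAG_ONLINE_CAPABLE   (1u << 3)

/*
 * A GICC entry describes a usable (possible) vCPU if it is either enabled
 * at boot or marked as capable of being brought online later.
 */
static bool gicc_cpu_possible(uint32_t gicc_flags)
{
    return (gicc_flags & (GICC_FLAG_ENABLED | GICC_FLAG_ONLINE_CAPABLE)) != 0;
}
```

Testing a GICC entry with the x86 mask would inspect bit 1 instead of bit 3, which is why the new define needs a distinct name.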