kernelcifoundation, Author at KernelCI Foundation

Mar 16

Looking back, looking forward

By kernelcifoundation Blog, Community

2020 was the first year of the KernelCI project under the Linux Foundation and has been an interesting one. Maybe slightly less “interesting” than the rest of the world-changing events of 2020, but it’s still been an adventure. This article aims to give a quick summary of the major milestones of the first year of KernelCI project, and highlight our goals for the next year.

Highlights of the first year

The founding members spent the end of 2019 doing a formal launch and ramping up the project structure and organization. This led to our mission statement and key goals. Throughout 2020 we gave talks and led discussions at several virtual conferences such as FOSDEM, Open-Source Summit / Embedded Linux Conference (ELC). Check out our blog for more details about the talks and discussions from these events.

Community Collaboration

In the middle of 2020, we did a Community survey to get a sense for what the kernel testing and automation community was looking for. This survey has helped guide where we focus our time and resources. See our blog for an article covering the full results of the survey.

One highlight of the 2020 conference circuit was Linux Plumbers Conference (LPC). At LPC, we gave talks and held focused discussions with our target audience: kernel developers and maintainers. The full details are in a blog article covering the event , but this is where we kicked off public discussions of how to unify test results and reporting from various testing and CI efforts in the community. We’re calling this common datastore for kernel testing results kcidb. Thanks to the discussions kicked off at LPC, we’re now collecting results from several other projects such as Red Hat CKI, Google syzbot, Arm, Gentoo, and the Fuego project. Continued collaboration with these projects as well as other new ones will be a focus area for 2021.

Infrastructure

Another area of growth in 2020 was in our IT infrastructure. As you might expect, we do lots of kernel builds, and that requires lots of compute horsepower. Our build capacity had been capped by a fixed number of donated build machines. But now, thanks to the generous donations of founding members Google and Microsoft, we now have scalable cloud compute resources under Google Compute Platform (GCP) and Microsoft Azure which we manage with Kubernetes so that we can dynamically scale as our compute needs grow.

Looking forward

Kicking off 2021

The governing board kicked off our 2nd year with some project organizational matters such as budgeting and electing this year’s executive committee. We are very happy to welcome Guillaume Tucker (Collabora) as the new board chair and Chris Paterson (CIP/Renesas) as the new treasurer. We also say a big thank you to outgoing chair Kevin Hilman (BayLibre) and outgoing treasurer Guy Lunardi (Collabora).

Focus areas

Data

As mentioned above, the collaboration with other testing and CI projects will remain a major focus for 2021. We want it to be easy for anyone doing kernel testing to be able to submit results to our open, centralized datastore: KCIDB. The amount of data we’re collecting is growing rapidly, so we’re also looking for help from “big data” experts to help us build the tooling to visualize and learn from all the data. Please write to us on the mailing list if you have any interest in helping here.

Infrastructure

We’ve been using Jenkins for years to manage our CI pipeline jobs, but as we’ve moved more of our infrastructure into the cloud, we’re looking at ways to migrate our CI infrastructure to a cloud-native framework such as Tekton or Jenkins-X. We’re in the early stages of exploration here, so anyone with experience here that could help guide us, we’d love to hear from you!

Data Visualization & Analysis

We’re also in the early stages of planning new dashboards for visualization and analysis of our growing data set. We’re soliciting feedback from the broader community by collecting user stories to better understand what our users want from new dashboards. In addition to making all the testing data and logs available through advanced visualization tools, we’d also like to enable analytics and deep learning on our growing data set. Once again, this is something we’d love your help on, so if you’re a big data enthusiast and want to put your skills to use to help the Linux kernel, please let us know.

Get involved!

Did you notice any themes above? We’re looking for help! We have some big ideas and plans, but we’re still a very small team and are looking for expertise in a few areas to help guide the future of the project.

Please keep in touch with what we’re up to or to get involved, you can read our blog, follow on twitter @kernelci or join our mailing list.

Nov 05

Love0

Notes from OSS/ELC Europe 2020

By kernelcifoundation Blog, Community

The OSS/ELC Europe 2020 conference took place online from 26th to 28th October. There was one BoF session and one talk about KernelCI, followed by an impromptu video call. The notes below were gathered based on these events.

BoF: Lessons Learned

Guillaume Tucker, Collabora

A lot has happened since KernelCI was announced as a new Linux Foundation project at ELC-E 2019 in Lyon. One year on, what have we learnt?

See the full Event description for slides and more details. Below are a list of Q&A gathered from the session.

Q: I wonder if you plan to add any subsystem-specific CI? Are there any plans/ideas? e.g. for scsi drivers

There are already subsystem-specific tests being run, and subsystem branches can be monitored. Then results can be sent to subsystem mailing lists. For example, this is the case with v4l2: kernelci.org runs v4l2-compliance on a number of platforms for several branches including the media tree, mainline, stable and linux-next, and sends reports with regressions.

There should not be any subsystem-specific infrastructure needed on kernelci.org, but rather different tests and maybe different parameters to adjust to the workflows according to maintainers’ needs.

Q: Some time ago there was a way to search for test runs in a specific lab. I mean on the dashboard. But it seems this feature is gone now. Was that intended? Is it coming back? Can we help and contribute here? 🙂

The web frontend was scaled down to accommodate for functional testing rather than boot testing. This was because all the boot testing search pages were tailor-made, which doesn’t scale very well and is very hard to maintain.

We’re now looking into a fresh web dashboard design with flexible search features to be able to do that. As a first step, we are collecting user stories. If you have any, such as “I want to find out all the test results for the devices in my lab”, feel free to reply to this thread:
https://groups.io/g/kernelci/topic/rfc_dashboards/77367531
“RFC: dashboards, visualization and analytics for test results”

Q: What is the relationship between KernelCI project and LAVA project? Does KernelCI have non-upstream changes to LAVA? Do LAVA people participate in KernelCI?

LAVA is used in many test labs that provide results to KernelCI, but KernelCI doesn’t run any labs itself. Some people do contribute to both, as KernelCI is one of the biggest public use-cases of LAVA, but they really are independent projects. The core KernelCI tools are designed to facilitate working with LAVA labs, but this is not a requirement and other test lab frameworks are also used.

Q: Is there any documentation on how to write those “custom” tests and to integrate it with KernelCI? (e.g. the SCSI drivers/storage devices you just mention before)

See Khouloud Touils’ talk Let’s Test with KernelCI with some hands-on examples.

There is also the user guide as part of the KernelCI documentation:
https://github.com/kernelci/kernelci-core/blob/master/doc/kci_testsuite.md

Each test is a bit different as they all have their own dependencies and are written in various languages. Typically, they will require a user-space image with all the required packages installed to be able to run as well as the latest versions of some test suites built from source. This is the case with v4l-utils, igt-gpu-tools or LTP. Some are plain scripts and don’t depend on anything in particular, such as bootrr.

When prototyping some new tests to run in LAVA, the easiest approach is to use nfsroot with the plain Debian Buster image provided by KernelCI and install extra packages at runtime, before starting the tests. Then when this is working well, dependencies and any data files can be baked into a fixed rootfs image for performance and reproducibility.

Q: How to properly deal with boards which are able to boot only from a mass-storage device and prevent them from being stuck with a non-working image?

In order to be useful with KernelCI, it’s required to at least be able to dynamically load the kernel image as well as any modules and device tree with a ramdisk for the tests that fit in a small enough image. If this can’t be done, then the kernel and user-space images need to be written to the persistent storage before each job. It might also be possible to load the kernel over TFTP and then extract the image onto the persistent storage and use it as a chroot. Ultimately this is the lab’s responsibility and it will depend on many things. If the kernel and the user-space can’t be changed at all, or if there is a possibility of bricking the device, then it’s basically not practical to do any CI on such a platform.

Let’s Test with KernelCI

Khouloud Touil, Baylibre

A growing number of Linux developers want to use KernelCI to run their test suites, but there’s a bit of a learning curve for how to make test suites work with KernelCI. “Let’s Test with KernelCI” will give an overview of the ways to integrate test suites and/or test results into the KernelCI modular pipeline.

See the full Event description for more details. Below are a list of Q&A gathered from the session.

Q: Is there also support for custom YP/OE distros or is it currently limited to the usage of predefined kernels and file systems?

The kernels are all built with regular “make”, not any packaging or yocto recipe is supported right now. However, that could be done with a bit of plumbing. Then for user-space, kernelci only really tests the kernel: the buildroot and debian images are only there to be able to run kernel tests. If you create your own KernelCI instance, you can run tests with your own user-space built using Yocto and extend testing to cover some user-space if you want.

Q: Is there some kind of test config to require a certain kernel flag active? I am basically thinking about running some pre-defined test base, based on my own kernel config and then report back the results with something like “ran test X, which requires kernel config flag Y, on architecture/platform Z on kernel version V”.

Yes, there are a couple of ways to adjust the kernel config on kernelci.org. One way is with a special syntax like defconfig+CONFIG_SOMETHING=y. Another way is to define a config fragment. Each KernelCI test result will have the information you mentioned as meta-data.

Q: Which firewall streams must be permitted in order for KernelCI to use a custom Lab? I mean if we want to contribute a lab (with associated boards) to KernelCI.org?

LAVA exposes a REST API over HTTPS. It’s also possible to have the LAVA server hosted publicly and using LAVA dispatchers in a private network which will be connecting to the server as clients, with no incoming connections.

When not using LAVA, you can also periodically poll storage.kernelci.org for new kernel builds to appear, and download them to test them then send results to api.kernelci.org. In this case, no incoming connections are required either.

Q: In real life how are tests that need to check hardware I/O done? For example in your audio playback case it’s probably not enough to run the play command but we want to check that something was actually played e.g. by capturing the output.

For audio (and video), some hardware has loopback devices which can be used to compare against expected output. For more advanced setup, labs can have external capture equipment as well. But this ends up to be lab-specific since there are many ways to do it.

Follow-up impromptu video discussion

As we neared the end of the time slot for the “Let’s Test with KernelCI” talk, we decided to start a public video call with anyone who was interested and attended the talk. We discussed various general things about the project, and a few notes and Q&A were captured:

Q: How can a test lab get added to kernelci.org?

This is something that would require better documentation. We can distinguish 3 different “levels” of integration for labs:

LAVA-style: fully integrated into the pipeline
If you have a LAVA lab, it’s the easiest way to contribute test results to KernelCI. It also enables automated bisection and is the most efficient way of getting tests run.
Asynchronous test lab
If you have a test lab with no way to receive requests to run tests, you can look for kernelci.org builds to appear and submit results with kci_data. A typical example is Labgrid. One way to improve this is to implement some notification protocol so these labs could avoid polling and get requests to run tests like the LAVA labs.
Autonomous CI system: KCIDB
With options 1 and 2, tests use kernel builds from kernelci.org and report results to the same backend. This is called the “native” KernelCI tests. Option 3 is for full CI systems creating their own kernel builds and running their own set of tests. The results are sent to the common reporting database using the KCIDB tools.

Q: Where can we find the source and definition of tests visible on kernelci.org frontend?

This is also something that would require better documentation, with a directory of all the test plans and how they are created. Functional tests are fairly recent on kernelci.org, which is why we don’t have that yet.

All the tests are normally defined in the kernelci-core repository. This includes building some test suites from source and including them in user-space rootfs images, and defining how to run the tests.

User story: Checking results for devices in “my” lab across all the branches and revisions.

Sep 23

Love0

KernelCI Notes from Plumbers 2020

By kernelcifoundation Blog, Community

The Linux Plumbers Conference 2020 was held as a virtual event this year. The online platform provided a really good experience, with talks and live discussions using Big Blue Button for the video and Rocket Chat for text-based discussions. KernelCI was mentioned many times in several micro-conferences, with two talks in Testing & Fuzzing which are now available on YouTube:

The notes below were gathered publicly from a number of attendees, they give a good insight into what was discussed. In short, while there is still a lot to be done, the KernelCI project is healthy and growing well in its role of a central CI system for the upstream Linux kernel.

Real-Time Linux

We’ve been making great progress with running LAVA jobs using the test-definitions repository from Linaro, thanks to Daniel Wagner’s help in particular. This was prompted by the discussions in the real-time micro-conference.

Initial GitHub PR: https://github.com/kernelci/kernelci-core/pull/475
Results on staging: v5.4.61-rt37

The next steps from a KernelCI infrastructure point of view is to be able to detect performance regressions, as these are different to binary pass/fail results. KernelCI can already handle measurements, but not yet compare them to detect regressions. Real-time getting merged upstream means it is becoming increasingly important to be able to support this.

There was also an interesting talk about determining the scheduler latency when using RT_PREEMPT and the introduction of a new tool “rtsl” to trace real-time latency. This might be an interesting area to investigate and potentially run automated tests with:

Static Analysis

The topic of static analysis and CI systems came up during the Kernel Dependability MC, and in particular, they were looking for a place to do “common reporting” in order to collect results for the various types of static analysis and checkers available. We pointed them to the KernelCI common reporting talks/BoFs.

Some static analysis can also be done by KernelCI “native” tests using the kernelci.org Cloud infrastructure via Kubernetes, which is currently only used to build kernels. This is probably where KUnit and devicetree validation will be run, but the rest still needs to be defined.

KCIDB

Fuego

Tim Bird, the main developer of Fuego at SONY, started joined the KCIDB BoF and we had a good discussion. Unfortunately he had not enough time to go through to an actual submission. We got about a quarter way through converting his mock data to KCIDB.

Gentoo Kernel CI

Alice Ferrazzi, maintainer of GKernelCI at Gentoo, had more time available for the KCIDB BoF and we talked through getting the data out of her system. A mockup of her data was made and successfully submitted to the KCIDB playground database setup.

Intel

Tim Orling, Yocto project architect at Intel, has expressed keen interest in KCIDB. He said he would experiment at home and will push Intel internally to participate.

LLVM/Clang

The recently added support for “LLVM=1” upstream means we can now have better support for making Clang builds. In particular, this means we’re now using all the LLVM binaries and not just clang. It also solved the issue with merge_config.sh and the default CC=gcc in the top-level Makefile.

This was enabled in kernelci.org shortly after LPC.

kselftest

The first kselftest results were produced on staging.kernelci.org during Plumbers as a collective effort. We have now started enabling them in production, so stay tuned as they should soon start appearing on kernelci.org.

Initial set of results: https://kernelci.org/test/job/next/branch/master/kernel/next-20200923/plan/kselftest/

AutoFDO

AutoFDO will hopefully get merged upstream, once it is it might be useful for CI systems to share profiling data from benchmarking runs in particular.

Building randconfig

The TuxML project carries out some research around Linux kernel builds: determining the build time, what can be optimised, which configurations are not valid… The project could benefit from the kernelci.org Cloud infrastructure to extend its build capacity while also providing more build coverage to kernelci.org. This could be done by detecting kernel configurations that don’t build or lead to problems that can’t be found with the regular defconfigs.

Using tuxmake

The goal of tuxmake is to provide a way to reproduce Linux kernel builds in a controlled environment. This is used primarily by LKFT, but it should be generic enough to cover any use-case related to building kernels. KernelCI uses its kci_build tool to generate kernel configurations and produce kernel builds with some associated meta-data. It could reuse tuxmake to avoid some duplication of effort and only implement the KernelCI-specific aspects.

Aug 21

Love0

Introducing Common Reporting

By kernelcifoundation Blog

During the past year we’ve been working on a way to bring reports from all the separate kernel testing systems together. Our aim is to send a single report email…

Aug 19

Love0

KernelCI at Linux Plumbers

By kernelcifoundation Blog, Community, News

KernelCI will be attending Linux Plumber’s Conference (LPC) online next week. We will be participating in the Testing and Fuzzing micro-conference, and leading a few discussion topics around automated kernel testing and reporting in the broader kernel community.

Jul 09

Love0

KernelCI Community Survey Report

By kernelcifoundation Blog

We are thrilled to share with you the results of our first KernelCI Community Survey. It has been a very interesting experience, with just under 100 responses from people who all provided quality feedback. We are really thankful for every single one of them. It was also a great way to engage more widely with the community. The full results are available for everyone to see in a shared spreadsheet. Individual comments are not shared publicly although they are very valuable and will be taken into account.