Skip to main content
Category

Blog

Now KernelCI has a dedicated SysAdmin!

By Blog, Community, News

Recently, we posted about the Request For Proposals to undertake SysAdmin tasks for KernelCI.org. That process is now complete, and we are delighted to announce that the KernelCI advisory board hired Vince Hillier from Revenni Inc to conduct the work.

Vince will work together with the Technical Steering Committee (TSC) to maintain and improve the current project infrastructure. As KernelCI.org scales with more kernels being built and more tests being run, we definitely need help to keep our systems stable and up to date.

The KernelCI project relies on a number of web services which need constant maintenance. These include databases, automation tools and web dashboards for several instances. Some are hosted on dedicated virtual machines (VMs), others in the cloud.

Request for Proposals: Sysadmin Maintenance 2022

By Blog, Community, News

Summary

The KernelCI project relies on a number of web services which need constant maintenance. These include databases, automation tools and web dashboards for several instances. Some are hosted on dedicated virtual machines (VMs), others in the cloud. This Request for Proposals seeks to extend the current team with dedicated sysadmins to ensure the maintenance of these services is being carried out to guarantee a good quality of service. Additionally, some improvements can be made to reduce the maintenance burden.

Budget

We’re expecting quotations for this work package to range between 15,000 and 30,000 USD depending on the contents of the proposal and for a period of six months. Longer sysadmin time available and extra improvements can justify a higher price. Payments would be made via the Linux Foundation using the project’s own budget.

Sending proposals

Please take a look at the full RFP-Sysadmin-Maintenance-2022-v3.pdf document for all the details. Proposals should be sent by email directly to the project members.

The deadline for responding to this RFP is 8th August 2022, six weeks after it has been made public. Then the KernelCI Advisory Board of Members will vote on the 24th August 2022. Exact dates might be subject to change in case of a major practical issue or unavailability of voting members.

KernelCI Hackfests

By Blog, Community

KernelCI hackfests span over a few days during which a number of contributors get together to focus on upstream Linux kernel testing. So far, mainly kernel and automated test system developers have been taking part in the hackfests but anyone is welcome to join. Topics mostly include extending test coverage in various ways: enabling new test suites as well as adding test cases to established frameworks such as kselftest and LTP, building additional kernel flavours, bringing up new types of hardware to be tested…

There have been two hackfests to date. The current plan is to hold one every few months. Future hackfests will be announced on the KernelCI mailing list as well as LKML and Twitter. Stay tuned!

Connecting the dots

There is a large ecosystem around the Linux kernel which includes testing in many shapes and forms: kernel developers, test developers, test system developers, OEMs testing fully integrated products… All these teams of people don’t necessarily interact with each other very much outside of their organisations, and kernel developers aren’t necessarily in the habit of writing tests as part of their daily work.

Events such as the KernelCI hackfests give a chance for people from these different areas to work together on solving common issues and keep the ecosystem healthy. It also helps with shifting the upstream Linux kernel development culture towards a more test-driven workflow, to bring mainline Linux closer to the real world where it is actually being used.

Let’s consider a plausible hackfest story. A first participant writes a new test, say in kselftest. A second participant enables the test to run in KernelCI, which fails in some cases and a kernel bug is found. A third participant makes a fix for the bug, which can then be tested directly in KernelCI to confirm it works as expected. This may even happen on the staging instance before the patches for the test and the fix are sent to any mailing list, in which case the fix would get a Tested-by: "kernelci.org bot" <bot@kernelci.org> trailer from the start. This scenario also relies on some hardware previously made available in test labs by other people, putting together efforts from at least four different participants.

Timeline

Here’s a summary of the first two hackfests:

Hackfest #1 – 27th May to 4th June 2021

  • Workboard: https://github.com/orgs/kernelci/projects/3
  • Participans
    • Several kernel developers from Google Chrome OS
    • Several kernel developers from Collabora
    • A few members of the core KernelCI team
  • Achievements
    • New KernelCI instance for Chrome OS on https://chromeos.kernelci.org
    • Added support for building Chrome OS configs on top of mainline
    • Enabled clang-13 builds for Chrome OS and main KernelCI builds
    • New test suite for libcamera enabled on https://linux.kernelci.org
    • Enabled LTP crypto tests with extra kernel config fragment
    • Several patches were also sent to extend kselftest and KUnit coverage in the kernel tree

Hackfest #2 – 6th to 10th September 2021

Lessons Learned

What went well

  • There were several new contributors. The hackfest is a great way to get people started with KernelCI.
  • Hackfest #2 showed more diversity with a wider representation from the ecosystem.
  • There were many improvements in various areas (bug fixes, documentation) which is a sign of a healthy project.
  • The workflow based on GitHub and the Big Blue Button platform appear to have been easily adopted by the participants.

What needs to be improved

  • Having more kernel maintainers involved would help with setting priorities and ojectives for KernelCI in accordance with kernel developers’ needs.
  • A hackfest every 3 months may be a bit too often, or maybe some could be shorter or have a more particular theme.
  • Changes can take a long time to get merged. The main limitation seems to be the number of people available to do code reviews and drive discussions.
  • KernelCI has been focusing on running more tests for over a year (kselftest, LTP, IGT…). Now the core architecture needs to be improved to scale better.

Next hackfest

The actual dates haven’t been confirmed yet, but with the current 3-month frequency the next hackfest should be taking place early December 2021. A proposed theme could be “KernelCI for newbies”, with a selection of tasks well suited for first-time contributors and documentation improved in that area prior to the hackfest. As always, suggestions are always welcome so please do get in touch if you have any.

See you there!

The first ever KernelCI hackfest

By Blog, Community

The first KernelCI test development and coverage hackfest took place from 27th May to 4th June 2021. For a total of seven days, developers from the KernelCI team, Google, and Collabora worked to improve many different aspects of KernelCI testing capabilities.

The hackfest was a community event promoted by the KernelCI team. It aimed at bringing developers and companies together to improve testing for areas of the Linux kernel they care about. Through this effort, the KernelCI team also expects to increase awareness for continuous kernel testing and validation – more hackfests will happen in the future, so stay tuned if you want to join.

The first-ever KernelCI hackfest was a success. It kicked off the work to enable kernel testing through Chromium OS, a product-specific userspace. Enabling full userspace images and real-world tests like video call simulations adds a lot of complexity to the testing process. However, the benefits are a clear win for the community. They allow a more thorough kernel testing and validation through real application use cases, which can exercise several different kernel areas at the same time in an organized manner. Generally, it is not simple for lower-level kernel test suites like kselftests or LTP to orchestrate a similar use case.

Consider video call simulation for example. Once the user starts a video call, the test can begin by measuring the time needed to set up the video feed and show it in the browser rendered with the rest of the video call website. Then, as soon as the video call is up, many other measurements can be made: camera capture latency, camera stream to network latency, memory consumption, power consumption, GPU performance, background tasks latency, and user interaction latency. These types of tests stress the kernel in unique ways, exposing problems that might otherwise go unnoticed from release to release.

The support for the Chromium OS userspace is the start of full-stack tests in KernelCI. It is still quite experimental, but the support will evolve over the next few months, opening the path for other product-specific userspaces. Increased kernel testing diversity will definitely result in catching more regressions earlier. “Keeping upstream healthy is really important to us in the Chrome OS team (and Google broadly!) since we constantly pull in stable fixes and regularly push out major kernel version upgrades to our users.” said Jesse Barnes who leads the Base OS team of Chrome OS at Google.

On another front, there was progress enabling more testing for different kernel areas as well as improvements to the rootfs testing images used by KernelCI:

  • kselftests received new tests for the futex() system call, basic semantics validation, and soft-dirty page table entry mechanism corner cases;
  • improved LTP crypto tests by enabling missing kernel configs needed;
  • libcamera now has its first few tests running on KernelCI;
  • fs/unicode tests converted to the new Kunit mechanism;
  • bootr test to check if all CPUs went online successfully;
  • experimental support for including firmware files in the rootfs.

The overall results were significant for only a few days of work. Kernel testing through product-specific userspace opens a whole new avenue of possibilities for KernelCI. On top of that, there was an accomplishment for many test cases and test suites, as detailed above. “It has been an amazing week and a half. We’ve achieved a lot in such a short time, in spite of a few workflow weaknesses which we’re now addressing to help further grow the KernelCI community with new developers.” said Guillaume Tucker, KernelCI project lead and Senior Software Engineer at Collabora.

More hackfests will come in the future – this was only the first one. Dates are being discussed for the next hackfest to happen around the end of August, a few weeks before the Linux Plumbers Conference. The KernelCI team invites developers and companies to participate. Joining a hackfest is a great way to quickly evolve the kernel testing knowledge of your team, leading to products and services that work better with easier maintenance over time. So make sure to raise your hand and join the next time a KernelCI hackfest happens.


KernelCI is a Linux Foundation project. If you are interested in learning more or becoming a member contact us.

Looking back, looking forward

By Blog, Community

2020 was the first year of the KernelCI project under the Linux Foundation and has been an interesting one.  Maybe slightly less “interesting” than the rest of the world-changing events of 2020, but it’s still been an adventure.  This article aims to give a quick summary of the major milestones of the first year of KernelCI project, and highlight our goals for the next year.

Highlights of the first year

The founding members spent the end of 2019 doing a formal launch and ramping up the project structure and organization. This led to our mission statement and key goals.  Throughout 2020 we gave talks and led discussions at several virtual conferences such as FOSDEM, Open-Source Summit / Embedded Linux Conference (ELC).  Check out our blog for more details about the talks and discussions from these events. 

Community Collaboration

In the middle of 2020, we did a Community survey to get a sense for what the kernel testing and automation community was looking for.  This survey has helped guide where we focus our time and resources.  See our blog for an article covering the full results of the survey.

One highlight of the 2020 conference circuit was Linux Plumbers Conference (LPC). At LPC, we gave talks and held focused discussions with our target audience: kernel developers and maintainers.  The full details are in a blog article covering the event , but this is where we kicked off public discussions of how to unify test results and reporting from various testing and CI efforts in the community.  We’re calling this common datastore for kernel testing results kcidb.  Thanks to the discussions kicked off at LPC, we’re now collecting results from several other projects such as Red Hat CKI, Google syzbot, Arm, Gentoo, and the Fuego project.  Continued collaboration with these projects as well as other new ones will be a focus area for 2021.

Infrastructure

Another area of growth in 2020 was in our IT infrastructure.  As you might expect, we do lots of kernel builds, and that requires lots of compute horsepower.  Our build capacity had been capped by a fixed number of donated build machines.  But now, thanks to the generous donations of founding members Google and Microsoft, we now have scalable cloud compute resources under Google Compute Platform (GCP) and Microsoft Azure which we manage with Kubernetes so that we can dynamically scale as our compute needs grow.

Looking forward

Kicking off 2021

The governing board kicked off our 2nd year with some project organizational matters such as budgeting and electing this year’s executive committee.  We are very happy to welcome Guillaume Tucker (Collabora) as the new board chair and Chris Paterson (CIP/Renesas) as the new treasurer.  We also say a big thank you to outgoing chair Kevin Hilman (BayLibre) and outgoing treasurer Guy Lunardi (Collabora).  

Focus areas

Data

As mentioned above, the collaboration with other testing and CI projects will remain a major focus for 2021.  We want it to be easy for anyone doing kernel testing to be able to submit results to our open, centralized datastore: KCIDB.  The amount of data we’re collecting is growing rapidly, so we’re also looking for help from “big data” experts to help us build the tooling to visualize and learn from all the data.  Please write to us on the mailing list if you have any interest in helping here.

Infrastructure

We’ve been using Jenkins for years to manage our CI pipeline jobs, but as we’ve moved more of our infrastructure into the cloud, we’re looking at ways to migrate our CI infrastructure to a cloud-native framework such as Tekton or Jenkins-X.  We’re in the early stages of exploration here, so anyone with experience here that could help guide us, we’d love to hear from you!

Data Visualization & Analysis

We’re also in the early stages of planning new dashboards for visualization and analysis of our growing data set.  We’re soliciting feedback from the broader community by collecting user stories to better understand what our users want from new dashboards.  In addition to making all the testing data and logs available through advanced visualization tools, we’d also like to enable analytics and deep learning on our growing data set.  Once again, this is something we’d love your help on, so if you’re a big data enthusiast and want to put your skills to use to help the Linux kernel, please let us know.

Get involved!

Did you notice any themes above?  We’re looking for help!  We have some big ideas and plans, but we’re still a very small team and are looking for expertise in a few areas to help guide the future of the project.

Please keep in touch with what we’re up to or to get involved, you can read our blog, follow on twitter @kernelci or join our mailing list.

Notes from OSS/ELC Europe 2020

By Blog, Community

The OSS/ELC Europe 2020 conference took place online from 26th to 28th October. There was one BoF session and one talk about KernelCI, followed by an impromptu video call. The notes below were gathered based on these events.

BoF: Lessons Learned

Guillaume Tucker, Collabora

A lot has happened since KernelCI was announced as a new Linux Foundation project at ELC-E 2019 in Lyon. One year on, what have we learnt?

See the full Event description for slides and more details. Below are a list of Q&A gathered from the session.

Q: I wonder if you plan to add any subsystem-specific CI? Are there any plans/ideas? e.g. for scsi drivers

There are already subsystem-specific tests being run, and subsystem branches can be monitored. Then results can be sent to subsystem mailing lists. For example, this is the case with v4l2: kernelci.org runs v4l2-compliance on a number of platforms for several branches including the media tree, mainline, stable and linux-next, and sends reports with regressions.

There should not be any subsystem-specific infrastructure needed on kernelci.org, but rather different tests and maybe different parameters to adjust to the workflows according to maintainers’ needs.

Q: Some time ago there was a way to search for test runs in a specific lab. I mean on the dashboard. But it seems this feature is gone now. Was that intended? Is it coming back? Can we help and contribute here? 🙂

The web frontend was scaled down to accommodate for functional testing rather than boot testing. This was because all the boot testing search pages were tailor-made, which doesn’t scale very well and is very hard to maintain.

We’re now looking into a fresh web dashboard design with flexible search features to be able to do that. As a first step, we are collecting user stories.  If you have any, such as “I want to find out all the test results for the devices in my lab”, feel free to reply to this thread:
https://groups.io/g/kernelci/topic/rfc_dashboards/77367531
“RFC: dashboards, visualization and analytics for test results”

Q: What is the relationship between KernelCI project and LAVA project? Does KernelCI have non-upstream changes to LAVA? Do LAVA people participate in KernelCI?

LAVA is used in many test labs that provide results to KernelCI, but KernelCI doesn’t run any labs itself. Some people do contribute to both, as KernelCI is one of the biggest public use-cases of LAVA, but they really are independent projects. The core KernelCI tools are designed to facilitate working with LAVA labs, but this is not a requirement and other test lab frameworks are also used.

Q: Is there any documentation on how to write those “custom” tests and to integrate it with KernelCI? (e.g. the SCSI drivers/storage devices you just mention before)

See Khouloud Touils’ talk Let’s Test with KernelCI with some hands-on examples.

There is also the user guide as part of the KernelCI documentation:
https://github.com/kernelci/kernelci-core/blob/master/doc/kci_testsuite.md

Each test is a bit different as they all have their own dependencies and are written in various languages. Typically, they will require a user-space image with all the required packages installed to be able to run as well as the latest versions of some test suites built from source. This is the case with v4l-utils, igt-gpu-tools or LTP. Some are plain scripts and don’t depend on anything in particular, such as bootrr.

When prototyping some new tests to run in LAVA, the easiest approach is to use nfsroot with the plain Debian Buster image provided by KernelCI and install extra packages at runtime, before starting the tests. Then when this is working well, dependencies and any data files can be baked into a fixed rootfs image for performance and reproducibility.

Q: How to properly deal with boards which are able to boot only from a mass-storage device and prevent them from being stuck with a non-working image?

In order to be useful with KernelCI, it’s required to at least be able to dynamically load the kernel image as well as any modules and device tree with a ramdisk for the tests that fit in a small enough image. If this can’t be done, then the kernel and user-space images need to be written to the persistent storage before each job. It might also be possible to load the kernel over TFTP and then extract the image onto the persistent storage and use it as a chroot. Ultimately this is the lab’s responsibility and it will depend on many things. If the kernel and the user-space can’t be changed at all, or if there is a possibility of bricking the device, then it’s basically not practical to do any CI on such a platform.

Let’s Test with KernelCI

Khouloud Touil, Baylibre

A growing number of Linux developers want to use KernelCI to run their test suites, but there’s a bit of a learning curve for how to make test suites work with KernelCI. “Let’s Test with KernelCI” will give an overview of the ways to integrate test suites and/or test results into the KernelCI modular pipeline.

See the full Event description for more details. Below are a list of Q&A gathered from the session.

Q: Is there also support for custom YP/OE distros or is it currently limited to the usage of predefined kernels and file systems?

The kernels are all built with regular “make”, not any packaging or yocto recipe is supported right now. However, that could be done with a bit of plumbing. Then for user-space, kernelci only really tests the kernel: the buildroot and debian images are only there to be able to run kernel tests. If you create your own KernelCI instance, you can run tests with your own user-space built using Yocto and extend testing to cover some user-space if you want.

Q: Is there some kind of test config to require a certain kernel flag active? I am basically thinking about running some pre-defined test base, based on my own kernel config and then report back the results with something like “ran test X, which requires kernel config flag Y, on architecture/platform Z on kernel version V”.

Yes, there are a couple of ways to adjust the kernel config on kernelci.org. One way is with a special syntax like defconfig+CONFIG_SOMETHING=y. Another way is to define a config fragment. Each KernelCI test result will have the information you mentioned as meta-data.

Q: Which firewall streams must be permitted in order for KernelCI to use a custom Lab? I mean if we want to contribute a lab (with associated boards) to KernelCI.org?

LAVA exposes a REST API over HTTPS. It’s also possible to have the LAVA server hosted publicly and using LAVA dispatchers in a private network which will be connecting to the server as clients, with no incoming connections.

When not using LAVA, you can also periodically poll storage.kernelci.org for new kernel builds to appear, and download them to test them then send results to api.kernelci.org. In this case, no incoming connections are required either.

Q: In real life how are tests that need to check hardware I/O done? For example in your audio playback case it’s probably not enough to run the play command but we want to check that something was actually played e.g. by capturing the output.

For audio (and video), some hardware has loopback devices which can be used to compare against expected output. For more advanced setup, labs can have external capture equipment as well. But this ends up to be lab-specific since there are many ways to do it.

Follow-up impromptu video discussion

As we neared the end of the time slot for the “Let’s Test with KernelCI” talk, we decided to start a public video call with anyone who was interested and attended the talk. We discussed various general things about the project, and a few notes and Q&A were captured:

Q: How can a test lab get added to kernelci.org?

This is something that would require better documentation. We can distinguish 3 different “levels” of integration for labs:

  1. LAVA-style: fully integrated into the pipeline
    If you have a LAVA lab, it’s the easiest way to contribute test results to KernelCI. It also enables automated bisection and is the most efficient way of getting tests run.
  2. Asynchronous test lab
    If you have a test lab with no way to receive requests to run tests, you can look for kernelci.org builds to appear and submit results with kci_data. A typical example is Labgrid. One way to improve this is to implement some notification protocol so these labs could avoid polling and get requests to run tests like the LAVA labs.
  3. Autonomous CI system: KCIDB
    With options 1 and 2, tests use kernel builds from kernelci.org and report results to the same backend.  This is called the “native” KernelCI tests. Option 3 is for full CI systems creating their own kernel builds and running their own set of tests. The results are sent to the common reporting database using the KCIDB tools.

Q: Where can we find the source and definition of tests visible on kernelci.org frontend?

This is also something that would require better documentation, with a directory of all the test plans and how they are created. Functional tests are fairly recent on kernelci.org, which is why we don’t have that yet.

All the tests are normally defined in the kernelci-core repository. This includes building some test suites from source and including them in user-space rootfs images, and defining how to run the tests.

User story: Checking results for devices in “my” lab across all the branches and revisions.

KernelCI Notes from Plumbers 2020

By Blog, Community

The Linux Plumbers Conference 2020 was held as a virtual event this year. The online platform provided a really good experience, with talks and live discussions using Big Blue Button for the video and Rocket Chat for text-based discussions. KernelCI was mentioned many times in several micro-conferences, with two talks in Testing & Fuzzing which are now available on YouTube:

The notes below were gathered publicly from a number of attendees, they give a good insight into what was discussed. In short, while there is still a lot to be done, the KernelCI project is healthy and growing well in its role of a central CI system for the upstream Linux kernel.

Real-Time Linux

We’ve been making great progress with running LAVA jobs using the test-definitions repository from Linaro, thanks to Daniel Wagner’s help in particular. This was prompted by the discussions in the real-time micro-conference.

The next steps from a KernelCI infrastructure point of view is to be able to detect performance regressions, as these are different to binary pass/fail results. KernelCI can already handle measurements, but not yet compare them to detect regressions. Real-time getting merged upstream means it is becoming increasingly important to be able to support this.

There was also an interesting talk about determining the scheduler latency when using RT_PREEMPT and the introduction of a new tool “rtsl” to trace real-time latency.  This might be an interesting area to investigate and potentially run automated tests with:

Static Analysis

The topic of static analysis and CI systems came up during the Kernel Dependability MC, and in particular, they were looking for a place to do “common reporting” in order to collect results for the various types of static analysis and checkers available.  We pointed them to the KernelCI common reporting talks/BoFs.

Some static analysis can also be done by KernelCI “native” tests using the kernelci.org Cloud infrastructure via Kubernetes, which is currently only used to build kernels. This is probably where KUnit and devicetree validation will be run, but the rest still needs to be defined.

KCIDB

Fuego

Tim Bird, the main developer of Fuego at SONY, started joined the KCIDB BoF  and we had a good discussion. Unfortunately he had not enough time to go through to an actual submission. We got about a quarter way through converting his mock data to KCIDB.

Gentoo Kernel CI

Alice Ferrazzi, maintainer of GKernelCI at Gentoo, had more time available for the KCIDB BoF and we talked through getting the data out of her system. A mockup of her data was made and successfully submitted to the KCIDB playground database setup.

Intel

Tim Orling, Yocto project architect at Intel, has expressed keen interest in KCIDB.  He said he would experiment at home and will push Intel internally to participate.

LLVM/Clang

The recently added support for “LLVM=1” upstream means we can now have better support for making Clang builds. In particular, this means we’re now using all the LLVM binaries and not just clang. It also solved the issue with merge_config.sh and the default CC=gcc in the top-level Makefile.

This was enabled in kernelci.org shortly after LPC.

kselftest

The first kselftest results were produced on staging.kernelci.org during Plumbers as a collective effort.  We have now started enabling them in production, so stay tuned as they should soon start appearing on kernelci.org.

Initial set of results: https://kernelci.org/test/job/next/branch/master/kernel/next-20200923/plan/kselftest/

AutoFDO

AutoFDO will hopefully get merged upstream, once it is it might be useful for CI systems to share profiling data from benchmarking runs in particular.

Building randconfig

The TuxML project carries out some research around Linux kernel builds: determining the build time, what can be optimised, which configurations are not valid… The project could benefit from the kernelci.org Cloud infrastructure to extend its build capacity while also providing more build coverage to kernelci.org. This could be done by detecting kernel configurations that don’t build or lead to problems that can’t be found with the regular defconfigs.

Using tuxmake

The goal of tuxmake is to provide a way to reproduce Linux kernel builds in a controlled environment. This is used primarily by LKFT, but it should be generic enough to cover any use-case related to building kernels. KernelCI uses its kci_build tool to generate kernel configurations and produce kernel builds with some associated meta-data. It could reuse tuxmake to avoid some duplication of effort and only implement the KernelCI-specific aspects.

KernelCI Community Survey Report

By Blog

We are thrilled to share with you the results of our first KernelCI Community Survey. It has been a very interesting experience, with just under 100 responses from people who all provided quality feedback. We are really thankful for every single one of them. It was also a great way to engage more widely with the community. The full results are available for everyone to see in a shared spreadsheet. Individual comments are not shared publicly although they are very valuable and will be taken into account.

Read More