Docs/System startup

From Apertis

Apertis System Startup Overview

The Apertis boot process has a few major phases after initial power-on:

Bootloader
low-level disk reading and heuristics to determine which Mini Userspace to boot (or whether to perform a complete reset in case of extreme failure)
Mini Userspace
displays a welcome screen and warns if boot cannot proceed, then continues into Full Userspace
Full Userspace
starts critical services and sets up the system to accept user input


The following diagram explains the general decision tree for the boot process, which is described in greater detail below.

Chaiwala-boot-phases.png

Critical Sequences During Startup

The phases during startup generally proceed from most- to least-critical. For instance, a deadlock in the kernel during the Mini Userspace portion would mean the system would never complete its boot process and thus would be non-responsive to user input. On the other hand, if the networking system (near the end of the boot process) failed to start, driving directions may be unavailable, but the user could still play music and other media.

Any failure during boot should not result in a worsened state. In fact, the factory reset fallback feature should ensure that many types of catastrophic system failures (such as extreme filesystem corruption) are automatically repaired upon subsequent boot. Due to Apertis's focus on making the core system storage read-only for the majority of system uptime, there should be few opportunities for defects to adversely affect the boot process.

How Apertis System Startup Differs From Other Linux Distributions

Apertis checks bootloader flags corresponding to its Mini Userspace images and Full Userspace volumes to determine which kernel and initial RAM disk image (initrd) to boot. These flags indicate whether each Mini Userspace/Full Userspace pair is likely to boot successfully. In case neither candidate is valid, the system will automatically reset itself to its factory-installed state and reboot.

These flags are checked by a modified version of the u-boot bootloader and updated during bootloader execution, the Mini Userspace boot phase, and the Full Userspace phase (in which a normal system spends almost all of its time).

The Mini Userspace boot phase displays a welcome screen animation, which continues until the Full Userspace phase has proceeded far enough to display the initial user interface and react to user input.

Docs/video-animation-on-boot has information on how to modify the boot animation, which is implemented as a Plymouth theme.

For more details, please refer to the System Updates and Rollback Proposal.

Systemd And Its Role In Startup

Systemd manages system services, serving as a replacement for the traditional SysV init system. In this duty, Systemd is the first process started by the kernel in the "Full Userspace" portion of boot. It starts all processes required to bring the system up to the point where it is ready for user input.

Unlike the traditional init system, Systemd can also handle automated activation of services after the system has settled, based upon the requirements of applications run in response to user interaction. Further, Systemd can stop services when they are no longer needed.
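On-demand activation is typically expressed through socket units: systemd creates the listening socket itself and starts the matching service only when a client first connects. A minimal sketch, using a hypothetical example-daemon (the unit names and paths below are illustrative, not part of Apertis):

```ini
# example.socket -- hypothetical unit illustrating socket activation
[Unit]
Description=Example daemon socket

[Socket]
# systemd listens here on the service's behalf
ListenStream=/run/example.sock

[Install]
WantedBy=sockets.target
```

The matching example.service then only needs an ExecStart= line; systemd hands the already-open socket to the daemon when the first client connects, so other units can depend on the socket without waiting for the daemon to start.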

As an implementation detail, Systemd starts services in a highly parallel fashion, which makes the most use of the processor while services wait on their dependencies. This reduces total start-up time by preventing services blocked on input/output from delaying their dependent services (which would otherwise need to wait until the first service has finished starting).

More information about systemd can be found on its manual pages.

Startup Sequence Modification Dos And Don'ts

Do

configure Systemd services to only depend upon other services they absolutely require to function
this helps simplify and shorten the boot process
ensure less-important services depend upon later stages of the Systemd startup phase
this allows the system to bring up critical functionality sooner, which can make the startup feel shorter from the user perspective

Don't

add arbitrary delays to services
Systemd supports a wide variety of ways to describe dependencies between services, making timing delays, which are fragile and slow the boot process, unnecessary
remove actual dependencies from .service files in an attempt to shorten the boot process
Systemd is very good at minimizing the impact of the sequencing of service activation on the total boot time, so it's more important to ensure that all of a service's requirements have at least started when it does
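As an illustration of these guidelines, a unit file can declare its real ordering requirements instead of relying on delays (the unit and binary names below are hypothetical examples, not actual Apertis services):

```ini
# media-indexer.service -- hypothetical example
[Unit]
Description=Example media indexer
# Declare only the hard requirement, and order after it,
# rather than adding an arbitrary delay:
Requires=dbus.service
After=dbus.service

[Service]
ExecStart=/usr/bin/media-indexer

[Install]
# Hook a less-important service into a later target:
WantedBy=multi-user.target
```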

How To Profile And Optimize Startup

systemd has two tools that are useful to profile and optimise the system's startup time: systemd-analyze and systemd-bootchart

systemd-analyze

systemd-analyze is a tool to analyze system boot performance and retrieve statistics that can be used to improve the system startup time. It accepts different arguments to retrieve information from the system and service manager.

Following are some useful options, with explanations taken from the systemd-analyze man page.

systemd-analyze time prints the time spent in the kernel before userspace has been reached, the time spent in the initial RAM disk (initrd) before normal system userspace has been reached, and the time normal system userspace took to initialize. Note that these measurements simply measure the time passed up to the point where all system services have been spawned, but not necessarily until they fully finished initialization or the disk is idle.

# systemd-analyze time
Startup finished in 1.412s (kernel) + 1.869s (initrd) + 14.158s (userspace) = 17.440s

systemd-analyze blame prints a list of all running units, ordered by the time they took to initialize. This information may be used to optimize boot-up times. Note that the output might be misleading as the initialization of one service might be slow simply because it waits for the initialization of another service to complete.

# systemd-analyze blame
          4.601s user@1000.service
          3.529s systemd-journal-flush.service
          2.760s plymouth-start.service
          2.025s avahi-daemon.service
          1.657s connman.service
          1.639s systemd-udev-trigger.service
          1.579s systemd-sysctl.service
          1.408s bootlogs.service
          1.341s dev-mqueue.mount
          1.119s systemd-tmpfiles-setup.service
          1.090s auditd.service
          1.050s motd.service
          1.032s sys-kernel-debug.mount
           979ms apparmor.service
           899ms systemd-modules-load.service
           884ms systemd-random-seed-load.service
           879ms systemd-tmpfiles-setup-dev.service
           809ms iptables.service
           784ms connman-vpn.service
           723ms systemd-fsck-root.service
           688ms ofono.service
           653ms plymouth-read-write.service
           632ms systemd-logind.service
           442ms run-lock.mount
           361ms run-user.mount
           307ms plymouth-quit.service
           292ms plymouth-quit-wait.service
           262ms systemd-remount-fs.service
           120ms rollbackd.service
            89ms systemd-user-sessions.service
            80ms systemd-udevd.service
            75ms systemd-update-utmp-runlevel.service

These numbers provide the relative duration of each service during the boot process (though their absolute time is only accurate if this is run on the target hardware).

systemd-analyze options can be used in conjunction with the --user argument, which shows the performance data for the user session instead of the system manager. For example, the user@1000.service unit starts a set of services, so finer-grained information can be gathered with:

# systemd-analyze --user blame
           3.321s pulseaudio.service
           1.687s xorg.service
           1.408s tracker-miner-fs.service

systemd-analyze critical-chain [UNIT...] prints a tree of the time-critical chain of units (for each of the specified UNITs or for the default target otherwise). The time after the unit is active or started is printed after the "@" character. The time the unit takes to start is printed after the "+" character. Note that the output might be misleading as the initialization of one service might depend on socket activation and because of the parallel execution of units.

# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

graphical.target @13.871s
└─multi-user.target @13.871s
  └─user@1000.service @9.268s +4.601s
    └─systemd-user-sessions.service @9.159s +89ms
      └─basic.target @6.344s
        └─systemd-ask-password-plymouth.path @6.343s
          └─-.mount @523ms

systemd-analyze plot prints an SVG graphic detailing which system services have been started at what time, highlighting the time they spent on initialization.

# systemd-analyze plot > plot.svg
# eog plot.svg

Systemd-analyze-plot.svg

systemd-bootchart

bootchart is a tool that generates an SVG graph with information about process resource utilization (CPU, memory, and I/O).

It can be executed at any time but is usually run at boot time to collect information during the boot process in order to analyse and optimise it.

systemd-bootchart can be executed at boot time by passing init=/lib/systemd/systemd-bootchart to the kernel command line.

There are different options that can be passed to bootchart to tune its execution. These can also be set in /etc/systemd/bootchart.conf. Refer to the systemd-bootchart man page for information about the different options.
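For example, the length of the data collection is determined by the number of samples and the sampling frequency. A sketch of /etc/systemd/bootchart.conf follows; the values shown match the defaults documented in the man page at the time of writing, but check the man page for your version:

```ini
[Bootchart]
# 500 samples at 25 Hz = 20 seconds of collection
Samples=500
Frequency=25
# Directory where the resulting SVG graph is written
Output=/run/log
```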

To change the kernel command line on an Apertis image, press any key at the serial console during boot to start the bootloader's command prompt and enter the following commands:

MX6QSABRELITE U-Boot > setenv bootargs ${bootargs} init=/lib/systemd/systemd-bootchart
MX6QSABRELITE U-Boot > boot

systemd-bootchart will be invoked as the init process, and it will in turn fork and execute the real init, so the system boots as normal while bootchart collects system information during the boot process.

After a few seconds (20 by default, configurable in bootchart.conf) the data collection stops and the graph is generated. A message similar to the following will be shown:

systemd-bootchart wrote /run/log/bootchart-19700101-0000.svg

Systemd-bootchart.svg

Improvements made to speed up boot time

Patch PulseAudio to support systemd socket activation

A patch that adds socket activation support was posted on the PulseAudio mailing list but was not merged due to issues pointed out in review. Those issues were addressed, and the updated patch was backported to PulseAudio 3.0 so it could be applied to the Apertis PulseAudio package.

Before this change, these were the numbers shown by systemd-analyze:

# systemd-analyze time
Startup finished in 1.414s (kernel) + 2.059s (initrd) + 14.437s (userspace) = 17.910s
# systemd-analyze blame | grep user
          4.891s user@1000.service
# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

graphical.target @14.139s
└─multi-user.target @14.139s
  └─user@1000.service @9.247s +4.891s
    └─systemd-user-sessions.service @9.066s +99ms
      └─basic.target @6.243s
        └─systemd-ask-password-plymouth.path @6.243s
          └─-.mount @547ms
$ systemd-analyze blame --user
          4.085s pulseaudio.service
          3.159s tracker-miner-fs.service
          2.602s xorg.service
          1.363s tracker-store.service
           791ms flagtool.service
           203ms xdg-user-dirs-update.service

And after upgrading PulseAudio with systemd socket activation support:

# systemd-analyze time
Startup finished in 1.401s (kernel) + 1.892s (initrd) + 13.181s (userspace) = 16.475s
# systemd-analyze blame | grep user
          3.765s user@1000.service
# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

graphical.target @12.913s
└─multi-user.target @12.912s
  └─user@1000.service @9.146s +3.765s
    └─systemd-user-sessions.service @9.017s +69ms
      └─basic.target @6.252s
        └─systemd-ask-password-plymouth.path @6.252s
          └─-.mount @523ms
$ systemd-analyze blame --user
          3.083s tracker-miner-fs.service
          2.144s xorg.service
          1.460s pulseaudio.service
           895ms tracker-store.service
           421ms flagtool.service
           372ms xdg-user-dirs-update.service

So there is a noticeable improvement in startup, since user@1000.service no longer depends on pulseaudio.service (PulseAudio being started) but only on pulseaudio.socket (the PulseAudio UNIX socket created by systemd).
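The socket unit that enables this looks roughly like the following sketch (the unit actually shipped by the patched package may differ in its details):

```ini
# pulseaudio.socket -- sketch of the socket-activation unit
[Unit]
Description=PulseAudio native socket

[Socket]
# %t expands to the runtime directory, e.g. /run/user/1000
ListenStream=%t/pulse/native

[Install]
WantedBy=sockets.target
```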

Use systemd timer-based activation for tracker-miner-fs and tracker-store services

Another service that slowed down the user session startup, and hence boot time, was tracker-miner-fs.service. It is not necessary for this service to be part of the critical chain, but it still has to be started early, so systemd timer unit files were used to delay the services' execution at boot.
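Timer-based activation pairs a .timer unit with the service it starts; the timer, rather than the service itself, is what gets enabled at boot. A sketch follows; the 10-second delay is purely illustrative, not the value used in Apertis:

```ini
# tracker-miner-fs.timer -- sketch; the actual delay may differ
[Unit]
Description=Delayed start of tracker-miner-fs

[Timer]
# Start tracker-miner-fs.service this long after boot,
# keeping it out of the critical chain:
OnBootSec=10s

[Install]
WantedBy=timers.target
```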

With that change the boot time improved considerably, as shown by systemd-analyze:

# systemd-analyze time
Startup finished in 1.118s (kernel) + 1.528s (initrd) + 10.890s (userspace) = 13.536s
# systemd-analyze blame | grep user
          2.311s user@1000.service
# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

graphical.target @10.658s
└─multi-user.target @10.658s
  └─user@1000.service @8.345s +2.311s
    └─systemd-user-sessions.service @8.243s +89ms
      └─basic.target @5.959s
        └─systemd-ask-password-plymouth.path @5.959s
          └─-.mount @593ms
$ systemd-analyze blame --user
          1.819s xorg.service
          1.622s tracker-miner-fs.service
          1.462s pulseaudio.service
           386ms tracker-store.service
           380ms flagtool.service
           224ms xdg-user-dirs-update.service

There is a noticeable improvement in startup, since user@1000.service no longer depends on tracker-{miner-fs,store}.service but only on tracker-{miner-fs,store}.timer (the systemd timer unit files that delay the execution of these services).
