Author | Nejat Hakan |
nejat.hakan@outlook.de | |
PayPal Me | https://paypal.me/nejathakan |
Process Management
Introduction What is a Process
Welcome to the world of Linux process management! This is a fundamental area of any operating system, and understanding it is crucial for system administrators, developers, and power users alike. Before we dive deep, let's clarify the most basic concept: What exactly is a process?
Often, people use the terms "program" and "process" interchangeably, but in operating systems terminology, they have distinct meanings.
- Program: A program is a passive entity. It's a collection of instructions and associated data stored on a disk (or other storage medium) in an executable file format (like ELF, the Executable and Linkable Format, on Linux). Think of it as a recipe written down in a cookbook. It exists, it has instructions, but it's not doing anything on its own. Examples include the /bin/bash executable file, the /usr/bin/firefox executable, or a compiled C program you wrote.
- Process: A process is an active instance of a program being executed. It's the recipe being actively cooked in the kitchen. When you run a program, the Linux kernel loads the program's instructions and data into memory, allocates system resources, and begins executing the instructions. This active entity, with its own memory space, resources, and execution state, is a process. You can have multiple processes running the same program concurrently (e.g., multiple terminal windows each running the bash program result in multiple bash processes).
Key Components of a Process
When the kernel creates a process, it associates several key components and pieces of information with it, managed within the kernel's data structures (often referred to conceptually as the Process Control Block or PCB, though Linux uses task_struct):
- Process ID (PID): A unique positive integer assigned by the kernel to identify the process. PID 1 is special; it's typically the init or systemd process, the ancestor of most user-space processes. New PIDs are usually assigned sequentially, wrapping around when the maximum value is reached (configurable, often 32768 or higher).
- Parent Process ID (PPID): The PID of the process that created this process. Processes form a hierarchy, and the PPID links a child process back to its parent.
- Process State: Indicates what the process is currently doing (e.g., running, sleeping, stopped). We'll cover states in detail later.
- Program Counter (PC): Stores the memory address of the next instruction to be executed for this process.
- CPU Registers: Contains the values of the processor's registers when the process was last running. These need to be saved when the process is interrupted and restored when it resumes.
- Memory Management Information: Details about the process's virtual address space, including pointers to its code, data, stack segments, and page tables used by the kernel to map virtual to physical memory. Each process typically gets its own private virtual address space, providing memory protection.
- User ID (UID) and Group ID (GID): Identify the user and group that own the process. These IDs are used by the kernel to determine the process's permissions for accessing files and other system resources. Effective UID/GID and Real UID/GID can sometimes differ, related to concepts like setuid programs.
- Open File Descriptors: A list of files (including network sockets, pipes, etc.) that the process currently has open. Represented as small non-negative integers (0 for stdin, 1 for stdout, 2 for stderr by convention).
- Scheduling Information: Includes the process's priority, consumed CPU time, and other data used by the kernel's scheduler to decide when and for how long the process should run on the CPU.
- Signal Mask and Pending Signals: Information about which signals the process is blocking and which signals have been sent to it but not yet delivered or handled.
In essence, a process is much more than just program code; it's the program in execution, bundled with all the context, resources, and kernel-managed information required for it to run and interact with the system. Understanding this distinction and the components involved is the first step towards mastering process management in Linux.
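Several of these components can be inspected for a live process through the /proc filesystem. The following is a minimal sketch, assuming a Linux system with /proc mounted; the variable names (pid, status_pid, status_ppid) are chosen for illustration only.

```shell
# Inspect the kernel's view of the current shell via /proc (Linux-specific).
pid=$$

# Selected fields from the status file: name, state, PID, parent PID, owner IDs.
grep -E '^(Name|State|Pid|PPid|Uid|Gid):' "/proc/$pid/status"

# Open file descriptors; 0, 1 and 2 are stdin, stdout and stderr by convention.
ls "/proc/$pid/fd"

# Extract two values for later use in a script.
status_pid=$(awk '/^Pid:/ {print $2}' "/proc/$pid/status")
status_ppid=$(awk '/^PPid:/ {print $2}' "/proc/$pid/status")
echo "PID=$status_pid PPID=$status_ppid"
```

The Uid and Gid lines each list several values (real, effective, saved, and filesystem IDs), which is where a difference between real and effective IDs would show up.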
Workshop Identifying Your Shell Process
Let's start with a simple, practical task: identifying the process associated with the very terminal shell you are likely using right now.
Objective: Find the Process ID (PID) and Parent Process ID (PPID) of your current shell.
Steps:
- Open Your Terminal: If you don't already have one open, launch a terminal window. You are now interacting with a shell process (likely bash, zsh, or similar).
- Identify the Shell's PID: The shell itself provides a special variable $$ that expands to its own PID. Use the echo command to display it: echo $$
  - Explanation: echo is a command that prints its arguments to the standard output. $$ is a special parameter in most shells that the shell replaces with its own Process ID before executing the command.
  - Output: You will see a number, for example, 12345. This is the PID of your current shell process. Note this number down.
- Use ps to Verify and Find the PPID: The ps command is the primary tool for viewing processes. We'll use it to find the process with the PID you just identified and look at its details, including the PPID: ps -p <PID> -o pid,ppid,cmd
  - Replace <PID> with the actual number you got from echo $$. For example, if your PID was 12345, you'd run: ps -p 12345 -o pid,ppid,cmd
  - Explanation:
    - ps: The command to report a snapshot of current processes.
    - -p <PID>: This option tells ps to only show information about the process with the specified PID.
    - -o pid,ppid,cmd: This option specifies a custom output format. We are asking ps to show only the pid (Process ID), ppid (Parent Process ID), and cmd (the command being run, possibly truncated).
  - Output: You should see a header line and one line for your shell (your numbers and command might differ). This confirms the PID you found earlier and shows you the PPID. The CMD column shows the name of the program being executed (for example, bash).
- (Optional) Find the Parent Process: Now you know the PPID. You can use ps again to see what that parent process is: ps -p <PPID> -o pid,ppid,cmd
  - Replace <PPID> with the PPID you found in the previous step.
  - Output: You might see your terminal emulator program (like gnome-terminal-server, konsole, etc.), or perhaps another shell if you started this shell from another one. This demonstrates the parent-child relationship.
Summary: In this workshop, you learned how to find the unique identifier (PID) of your current shell process using the $$ variable and confirmed it using the ps command. You also discovered the PID of its parent process (PPID), illustrating the process hierarchy.
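The workshop steps can also be run as one small script. This is a sketch, assuming the procps version of ps found on most Linux distributions; the variable names shell_pid and parent_pid are illustrative.

```shell
# Step 2: the shell's own PID.
shell_pid=$$
echo "Shell PID: $shell_pid"

# Step 3: confirm it with ps and show the parent PID and command.
ps -p "$shell_pid" -o pid,ppid,cmd

# Capture just the PPID ("ppid=" suppresses the header line).
parent_pid=$(ps -p "$shell_pid" -o ppid= | tr -d ' ')
echo "Parent PID: $parent_pid"

# Step 4 (optional): look up the parent process. In some containers the
# parent lives outside the visible PID namespace, hence the fallback message.
ps -p "$parent_pid" -o pid,ppid,cmd || echo "parent process is not visible from here"
```

Most shells also expose the parent's PID directly as $PPID, which should match the value that ps reports.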
1. Process States and Lifecycle
A process doesn't just run constantly; it transitions through various states during its lifetime, reflecting its current activity and interaction with the kernel and hardware. Understanding these states is crucial for diagnosing performance issues and managing system resources.
The Process Lifecycle
A typical process lifecycle involves the following phases:
- Creation: A process is born when an existing process makes a fork() system call. The fork() call creates a near-identical copy of the calling process (the parent), resulting in a new process (the child). The child inherits many attributes from the parent but gets its own unique PID. Often, the child process will then use an exec() family system call to replace its memory image with a new program, effectively starting a different task.
- Scheduling: Once created, the process enters the pool of runnable processes. The kernel's scheduler decides which runnable process gets to use the CPU next based on scheduling algorithms and process priorities.
- Execution: The process runs on the CPU, executing its instructions. During execution, it might:
  - Run until its allotted time slice expires.
  - Voluntarily give up the CPU by making a blocking system call (e.g., reading from a file, waiting for network input).
  - Be preempted by the scheduler if a higher-priority process becomes ready.
- Blocking/Sleeping: If a process needs to wait for an event (e.g., I/O completion, a signal, availability of data), it enters a sleeping or blocked state. It won't be considered for CPU execution until the event it's waiting for occurs.
- Termination: A process can terminate in several ways:
  - Normal Exit: The process completes its task and calls the exit() system call (often invoked implicitly when main() returns in C programs). It provides an exit status code (0 typically indicates success, non-zero indicates an error).
  - Error Exit: The process encounters an unrecoverable error and terminates itself, usually with a non-zero exit status.
  - Killed by Signal: The process receives a signal that causes it to terminate (e.g., SIGTERM, SIGKILL).
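The three termination paths can be told apart from the parent's side by looking at the exit status in $?. The sketch below assumes a POSIX shell; the variable names are illustrative, and the "|| var=$?" form is used only so the script keeps running if the surrounding shell is set to abort on errors.

```shell
# Normal exit: status 0 signals success.
sh -c 'exit 0'
normal_status=$?

# Error exit: the program picks a non-zero status itself.
error_status=0
sh -c 'exit 3' || error_status=$?

# Killed by signal: the child sends SIGTERM (signal 15) to itself.
# The shell reports this as a status above 128 (bash and dash use 128 + 15).
signal_status=0
sh -c 'kill -TERM $$' || signal_status=$?

echo "normal=$normal_status error=$error_status signal=$signal_status"
```

A status above 128 is a convention of the shell, not a value the dying process chose, which is why it is useful for spotting processes that were killed.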
Common Process States in Linux
The ps command often displays a state code (e.g., in the STAT or S column). Here are the most common states:
- R (Running or Runnable):
  - Running: The process is currently executing instructions on a CPU core.
  - Runnable: The process is ready to run and is waiting in a queue for the scheduler to assign it a CPU core. From the perspective of ps, both are usually shown as R because the state can change very rapidly. A process consuming significant CPU time will likely show as R.
- S (Interruptible Sleep):
  - The process is waiting for an event to complete or a resource to become available. Examples include waiting for terminal input, data from a network socket, or a timer to expire.
  - It's "interruptible" because it can be woken up prematurely by a signal.
  - This is a very common state for processes that are not actively doing computation (e.g., idle shells, web servers waiting for requests).
- D (Uninterruptible Sleep):
  - Similar to S, the process is waiting, typically for I/O operations (like reading from or writing to a disk) to complete directly with hardware.
  - It's "uninterruptible" because it will not respond to most signals while in this state. This is necessary to prevent data corruption if the process were interrupted during a critical hardware interaction.
  - Processes stuck in the D state for long periods can indicate hardware problems or driver issues (especially with storage or network interfaces). Killing a process in state D is often impossible without resolving the underlying I/O issue or rebooting.
- T (Stopped or Traced):
  - Stopped: The process has been suspended, usually by receiving a specific signal like SIGSTOP or SIGTSTP (the latter is typically generated when you press Ctrl+Z in the terminal). It will not run until it receives a SIGCONT signal.
  - Traced: The process is being monitored by another process, such as a debugger (like gdb). Execution is suspended each time the debugger needs to inspect it.
- Z (Zombie or Defunct):
  - The process has terminated (it called exit() or was killed), but its entry in the kernel's process table still exists.
  - Why? Because it contains information, primarily the process's exit status code, that the parent process might need to collect.
  - The parent process is expected to call a wait() family system call to "reap" the zombie child. This call retrieves the child's exit status and allows the kernel to finally remove the zombie process entry.
  - A small number of zombie processes existing for a very short time is normal. However, if a parent process fails to call wait() (e.g., due to poor programming), its zombie children will remain until the parent terminates (at which point they are reparented, typically to init/systemd, which does reap zombies) or the system reboots.
  - Zombies consume very few resources (mainly just the process table slot), but a large accumulation could potentially exhaust the PID space or process table slots. You cannot kill a zombie process directly (it's already dead); you must fix or kill the parent process that is failing to reap it.
State Modifiers (often seen appended to the main state character in ps output):
- <: High-priority process (negative nice value).
- N: Low-priority process (positive nice value).
- L: Pages locked into memory (often for real-time operations).
- s: Session leader (a process that started a new session).
- l: Multi-threaded process (using POSIX threads).
- +: Process is in the foreground process group of its controlling terminal.
Understanding these states helps you interpret the output of monitoring tools like ps and top, allowing you to quickly assess what processes are doing on your system.
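To get a feel for how these states are distributed on a real system, the state column can be summarized. This is a small sketch, assuming the procps ps; the variable name my_stat is illustrative.

```shell
# Take the first character of each STAT value (the main state) and count
# how often each occurs. "stat=" prints the column without a header.
ps -eo stat= | cut -c1 | sort | uniq -c | sort -rn

# Show the full STAT string, including any modifiers, for this shell.
my_stat=$(ps -p $$ -o stat= | tr -d ' ')
echo "This shell's STAT: $my_stat"
```

On a mostly idle machine the S state usually dominates. Depending on the kernel and ps version, you may also see codes not covered above, such as I for idle kernel threads.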
Workshop Observing Process States
In this workshop, we'll create processes and manipulate them to observe different states using the ps command.
Objective: Observe processes in the Running (R), Interruptible Sleep (S), Stopped (T), and Zombie (Z) states (though creating a persistent Zombie can be tricky without specific code).
Tools: ps, sleep, kill, shell job control (Ctrl+Z, bg, fg).
Steps:
- Observe Interruptible Sleep (S):
  - The sleep command simply pauses for a specified number of seconds. It spends most of its time waiting for a timer to expire, which puts it into interruptible sleep.
  - Open a terminal and run: sleep 600 &
    - Explanation: sleep 600 tells the command to pause for 600 seconds (10 minutes). The & puts the command into the background, so you immediately get your shell prompt back. The shell will print the job number and PID of the background process. Note the PID.
  - Now, use ps to check its state, using the PID reported when you started the background job: ps -p <PID> -o pid,stat,cmd
    - Replace <PID> with the actual PID of the sleep process.
  - Output: The STAT column should show S, which indicates the process is in Interruptible Sleep, waiting for the timer.
- Observe Stopped (T):
  - We need a foreground process to stop with Ctrl+Z. Let's use sleep again, but this time in the foreground.
  - In the same terminal, run: sleep 300
  - Your terminal will now pause.
  - Press Ctrl+Z.
  - Explanation: Ctrl+Z sends the SIGTSTP signal (Terminal Stop) to the foreground process.
  - Output: The shell usually prints a message like [1]+ Stopped sleep 300. It also gives you your prompt back.
  - Find the PID of this stopped sleep process. You can use the jobs command: jobs -l
    - This lists background/stopped jobs with their PIDs. Find the sleep 300 job and note its PID.
  - Check its state with ps: ps -p <PID> -o pid,stat,cmd
    - Replace <PID> with the PID of the stopped sleep 300 process.
  - Output: The STAT column should show T, which indicates the process is Stopped.
  - Resume the process: You can resume it in the background with bg %<job_number> (e.g., bg %1) or in the foreground with fg %<job_number>. If you resume it in the background, check its state again with ps; it should return to S. Let's bring it to the foreground with fg and terminate it with Ctrl+C.
- Observe Running (R):
  - Creating a process that consistently stays in the R state usually requires a CPU-intensive task. A simple infinite loop can demonstrate this.
  - Run the following command in the background: yes > /dev/null &
    - Explanation: The yes command continuously outputs 'y' (followed by a newline) very rapidly. We redirect its output (>) to /dev/null (a special file that discards everything written to it) so it doesn't flood your terminal. The & puts it in the background. Note the PID.
  - Quickly check its state with ps (it shares the CPU with other processes, but should show R almost every time): ps -p <PID> -o pid,stat,%cpu,cmd
    - Replace <PID> with the PID of the yes process. We added %cpu to see CPU usage.
  - Output: You should see state R and high CPU usage.
  - Terminate the process: Find its PID (if you didn't note it) using ps aux | grep yes or pgrep yes, then kill it forcefully: kill -9 <PID>
- Observing Zombie (Z):
  - Reliably creating a long-lived zombie usually requires a custom program where the parent process forks a child, the child exits immediately, and the parent deliberately doesn't call wait(). Doing this purely from the shell is difficult.
  - However, you can look for existing zombie processes on your system (there might not be any).
  - Use ps to find processes in the Z state: ps aux | grep ' Z '
    - Explanation: ps aux lists all processes with user-oriented format. grep ' Z ' filters for lines containing Z (with spaces to better match the state column and avoid matching other things).
  - Output: If you have zombie processes, you'll see output like:
      USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      someuser 14001  0.0  0.0      0     0 ?        Z    10:30   0:00 [someprocess] <defunct>
    The <defunct> tag is characteristic of zombie processes in ps output. Note the state Z. If you see this, the parent process (whose PID is the PPID of the zombie) needs to be investigated or potentially killed/restarted to clean it up.
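The S, T, and R observations can also be reproduced without an interactive terminal. The sketch below is a non-interactive variant under two assumptions: the procps ps is available, and SIGSTOP is used in place of Ctrl+Z (which sends SIGTSTP) because a script has no job control. Variable names are illustrative.

```shell
# Interruptible sleep (S): a background sleep waits for its timer.
sleep 600 &
sleep_pid=$!
sleep 1                                   # give it a moment to start
sleep_state=$(ps -p "$sleep_pid" -o stat= | tr -d ' ')
echo "sleeping: $sleep_state"

# Stopped (T): SIGSTOP suspends the process, much like Ctrl+Z would.
kill -STOP "$sleep_pid"
sleep 1
stopped_state=$(ps -p "$sleep_pid" -o stat= | tr -d ' ')
echo "stopped:  $stopped_state"

# Resume with SIGCONT (what bg and fg send), then terminate and reap it.
kill -CONT "$sleep_pid"
kill "$sleep_pid"
wait "$sleep_pid" 2>/dev/null || true

# Running (R): yes is CPU-bound when its output is discarded.
yes > /dev/null &
yes_pid=$!
sleep 1
yes_state=$(ps -p "$yes_pid" -o stat= | tr -d ' ')
ps -p "$yes_pid" -o pid,stat,%cpu,cmd
kill -9 "$yes_pid"
wait "$yes_pid" 2>/dev/null || true
```

The STAT values may carry modifiers (for example N on a niced process), so only the first character should be compared against S, T, or R.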
Summary: In this workshop, you used common Linux commands (sleep, yes) and shell features (&, Ctrl+Z) to create processes in different states (S, T, R). You then used ps with specific options to observe and verify these states. You also learned how to search for problematic Zombie (Z) processes. This practical experience reinforces the theoretical understanding of the process lifecycle.
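Although a custom program is the usual way to produce a zombie, one shell-only sketch relies on exec: a subshell starts a short-lived child and then replaces itself with a program that never calls wait(). This assumes the procps ps (for the --ppid option) and a shell that runs a backgrounded subshell in a single process, as bash and dash do; the timings and variable names are illustrative.

```shell
# The subshell starts a 1-second child, then exec replaces the subshell with a
# longer sleep. That sleep is now the child's parent and never reaps it.
( sleep 1 & exec sleep 6 ) &
parent_pid=$!

sleep 3    # by now the short child has exited but has not been reaped

# List the children of the long sleep; STAT should show Z and CMD <defunct>.
ps -o pid,ppid,stat,cmd --ppid "$parent_pid"
zombie_state=$(ps -o stat= --ppid "$parent_pid" | tr -d ' ')

# Once the parent exits, the zombie is reparented and reaped.
wait "$parent_pid"
```

The zombie disappears a few seconds later without any kill, which matches the rule that cleaning up a zombie means dealing with its parent.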
2. Viewing Processes
Being able to inspect the processes running on your Linux system is a fundamental skill. It allows you to monitor resource usage (CPU, memory), identify applications, troubleshoot problems, and manage system performance. Several command-line tools are available for this purpose, with ps, top, and htop being the most common.
The ps Command
ps (Process Status) provides a snapshot of the currently running processes at the moment you execute the command. It's highly flexible due to its numerous options.
Common ps Usage Styles:
There are two main styles of options for ps, stemming from its history:
- BSD Style: Options are used without a leading dash (-). Often combined.
  - ps aux: Probably the most common invocation.
    - a: Show processes for all users (not just your own).
    - u: Display user-oriented format (shows owner, CPU%, MEM%, etc.).
    - x: Show processes not attached to a terminal (daemon processes).
  - Output Columns (Typical for aux):
    - USER: The user owning the process.
    - PID: Process ID.
    - %CPU: Approximate percentage of CPU time used by the process. ps typically computes this as total CPU time divided by the time the process has been running, so it is a lifetime average rather than a measure of current load (and can be misleading for short-lived processes).
    - %MEM: Approximate percentage of physical memory (RAM) used by the process.
    - VSZ: Virtual Memory Size (in KiB). Total virtual address space used by the process.
    - RSS: Resident Set Size (in KiB). The portion of the process's memory currently held in physical RAM (non-swapped).
    - TTY: Controlling terminal associated with the process. ? usually means no controlling terminal (daemon).
    - STAT: Process state code (e.g., R, S, T, Z, plus modifiers like s, +, l).
    - START: Time the process started.
    - TIME: Cumulative CPU time consumed by the process (user + system time). Format is often HH:MM:SS or MM:SS.ms.
    - COMMAND (or CMD): The command that launched the process (might be truncated or show kernel thread names in brackets).
- System V / UNIX Style: Options are preceded by a single dash (-).
  - ps -ef: Another very common invocation, often preferred by those with a System V background.
    - -e: Show every process on the system. Equivalent to ax in BSD style.
    - -f: Full format listing. Shows UID, PID, PPID, C (CPU usage, but often less granular than %CPU), STIME (start time), TTY, TIME, CMD.
  - ps -eF: Extra full format (adds SZ, RSS, PSR - processor).
  - ps -ejH: Show processes in a hierarchy (like a tree). j is job format, H shows the hierarchy.
  - ps -eL: Show threads (Light Weight Processes, LWP) as separate entries. Each thread gets its own LWP ID but shares the same PID.
Customizing ps Output:
The -o option (or o in BSD style) allows you to specify exactly which columns you want to see.
- ps -p <PID> -o pid,ppid,user,%cpu,%mem,stat,start,time,cmd: Show specific columns for a given PID.
- ps axo pid,comm,pcpu,pmem --sort=-pcpu | head -n 11: Show PID, command name, CPU%, Mem% for all processes, sorted by CPU usage (descending), and show only the top 10 (head -n 11 keeps the header line plus 10 processes).
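The custom-format options lend themselves to scripting. The following sketch assumes the procps ps and uses the current shell ($$) in place of <PID>; the variable name shell_name is illustrative.

```shell
# Selected columns for the current shell.
ps -p $$ -o pid,ppid,user,%cpu,%mem,stat,start,time,cmd

# Processes sorted by CPU usage, highest first; head keeps the header plus 10 rows.
ps axo pid,comm,pcpu,pmem --sort=-pcpu | head -n 11

# A trailing "=" after a column name suppresses its header, which makes the
# value easy to capture in a variable.
shell_name=$(ps -p $$ -o comm=)
echo "This shell is: $shell_name"
```

The comm column holds only the executable name, while cmd (an alias for args in procps) includes the arguments, so comm is usually the better choice when matching on a program name.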
The top Command
top provides a dynamic, real-time view of the running system, focusing on resource-intensive processes. It refreshes the display every few seconds (default is usually 3 seconds).
top Interface:
- Summary Area (Top Lines):
  - System time, uptime, number of users logged in.
  - Load average: System load over the last 1, 5, and 15 minutes (a measure of how many processes are running or waiting to run; on Linux it also counts processes in uninterruptible sleep).
  - Tasks: Total number of processes, broken down by state (running, sleeping, stopped, zombie).
  - CPU States (%Cpu(s)): Percentage of CPU time spent in various states (us: user, sy: system, ni: nice, id: idle, wa: I/O wait, hi: hardware interrupts, si: software interrupts, st: steal time - for VMs).
  - Memory Usage (KiB Mem, KiB Swap): Total, free, used, buff/cache for physical memory (RAM) and swap space.
- Process List Area:
  - A list of processes, typically sorted by CPU usage by default.
  - Columns are similar to ps (PID, USER, PR [Priority], NI [Nice value], VIRT [Virtual Mem], RES [Resident Mem], SHR [Shared Mem], S [State], %CPU, %MEM, TIME+ [CPU Time, higher precision], COMMAND).
Interactive top Commands (Press while running):
- h or ?: Display help screen.
- q: Quit top.
- Space: Force an immediate refresh.
- M: Sort by memory usage (%MEM).
- P: Sort by CPU usage (%CPU) (default).
- T: Sort by cumulative CPU time (TIME+).
- k: Kill a process (prompts for PID and signal).
- r: Renice a process (prompts for PID and nice value).
- u: Filter by user (prompts for username).
- 1: Toggle display of individual CPU core stats (if multiple cores).
- f: Enter field management screen to add/remove/reorder columns.
- o: Enter filter screen to add criteria (e.g., COMMAND=bash).
- z: Toggle color display.
- x: Highlight sorting column.
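top can also produce plain-text output for logs or scripts. The sketch below assumes the procps-ng top, whose -b (batch mode) and -n (iteration count) options are used here; the line ranges passed to head and sed are a guess at the typical layout and may need adjusting.

```shell
# -b: batch mode (plain text, no interactive screen); -n 1: one iteration, then exit.
top_output=$(top -b -n 1)

# Summary area: uptime and load average, tasks, CPU states, memory.
echo "$top_output" | head -n 5

# Column header and the first few entries of the process list.
echo "$top_output" | sed -n '7,12p'
```

Batch mode ignores the interactive keys listed above, so sorting and filtering have to be chosen through command-line options or by post-processing the text.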
The htop Command
htop is an enhanced, interactive process viewer often considered more user-friendly than top. It provides a similar dynamic view but adds features like:
- Colorized display.
- Scrolling horizontally and vertically through the process list.
- Easier process manipulation (killing, renicing) often using function keys.
- Visual meters for CPU (per core), memory, and swap usage at the top.
- Tree view (press F5).
- Direct process searching (press F3) and filtering (press F4).
- Setup menu (F2) for customization.
If htop is not installed, you can usually install it using your distribution's package manager (e.g., sudo apt update && sudo apt install htop on Debian/Ubuntu, sudo dnf install htop on Fedora/CentOS/RHEL).
Choosing the Right Tool:
- Use ps when you need a specific snapshot, want highly customized output for scripting, or need to see process hierarchy easily (ps auxf, ps -ejH, pstree).
- Use top for a standard, real-time overview of system load and resource hogs, available on almost any Linux system.
- Use htop for a more interactive, user-friendly real-time monitoring experience with easier navigation and built-in features like tree view and searching (if available).
Workshop Monitoring System Processes
In this workshop, you'll practice using ps, top, and htop to explore the processes running on your system and identify resource usage.
Objective: Use process viewing tools to find specific processes, sort them by resource usage, and understand the output.
Prerequisites: htop should ideally be installed (sudo apt install htop or sudo dnf install htop).
Steps:
- Basic ps Exploration:
  - Open a terminal.
  - Get a full listing of all processes in user-oriented format: ps aux
    - Scroll through the list. Can you identify your shell process? Your desktop environment processes (e.g., gnome-shell, plasmashell)? System services (e.g., systemd, sshd, cron)?
  - Get a full listing showing the parent-child relationship: ps -ef
    - Look at the PID and PPID columns. Can you trace the lineage of a few processes back towards PID 1 (init or systemd)?
  - Find only the processes owned by your user, for example with: ps -u $USER
- Using ps for Specific Information:
  - Look up the process with PID 1 (the main systemd process on most distributions), for example with: ps -p 1
  - List all processes and show only their PID, command name, CPU usage, and memory usage. Sort by memory usage (highest first) and show the top 5, for example with: ps axo pid,comm,pcpu,pmem --sort=-pmem | head -n 6
  - List all processes and show PID, command name, CPU usage, and memory usage. Sort by CPU usage (highest first) and show the top 5, for example with: ps axo pid,comm,pcpu,pmem --sort=-pcpu | head -n 6
- Interactive Monitoring with top:
  - Start top: top
  - Observe the summary area: Note the load average, CPU usage breakdown, and memory usage.
  - Identify the top CPU-consuming processes (they are usually at the top by default).
  - Sort by Memory Usage: Press Shift+M. Observe how the list reorders. Note the processes using the most RAM (%MEM and RES columns).
  - Sort back by CPU Usage: Press Shift+P.
  - Filter by Your User: Press u, type your username, and press Enter. Now top only shows your processes. Press u, leave the prompt empty, and press Enter to show all users again.
  - Toggle Individual CPU View: Press 1. Observe the %Cpu(s) line change to show individual core statistics (%Cpu0, %Cpu1, etc.). Press 1 again to toggle back.
  - Leave top: Press q.
- Enhanced Monitoring with htop (If installed):
  - Start htop: htop
  - Compare the interface to top. Note the graphical meters and function key hints at the bottom.
  - Use the Arrow Keys (Up, Down, Left, Right) to navigate the process list.
  - Sort by Memory: Press F6 (SortBy), use arrow keys to select PERCENT_MEM, press Enter.
  - Sort by CPU: Press F6, select PERCENT_CPU, press Enter.
  - Show Tree View: Press F5 (Tree). See how processes are nested under their parents. Press F5 again to toggle back to the sorted list.
  - Filter Processes: Press F4 (Filter), type a command name (e.g., bash), press Enter. Only processes matching the filter are shown. Press F4 again, clear the filter, and press Enter to see all processes.
  - Search for a Process: Press F3 (Search), type part of a command name, press Enter. htop will jump to the next match. Press F3 again to find the next one.
  - (Optional - Be Careful!) Kill a safe process: Start sleep 1000 & in another terminal. In htop, find the sleep process (you might need to filter or search). Use the arrow keys to highlight it. Press F9 (Kill). The default signal is 15 SIGTERM. Press Enter to send it. The sleep process should disappear. If you started yes > /dev/null & earlier, you could try killing that with F9 and selecting signal 9 SIGKILL.
  - Access Setup: Press F2 (Setup). Explore the options for customizing meters, display options, and columns. Press F10 (Done) to exit setup.
  - Leave htop: Press F10 (Quit) or q.
Summary: This workshop provided hands-on experience with ps, top, and htop. You learned how to get different views of the process list, extract specific information, sort processes based on resource consumption, and use the interactive features of top and htop for real-time monitoring and basic management tasks.
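The ps parts of this workshop can be collected into one script. This is a sketch, assuming the procps ps; output is trimmed with head for brevity, and the exact commands are one reasonable choice for each step, not the only one. The variable name pid1_name is illustrative.

```shell
# Step 1: every process in user-oriented format (first lines only).
ps aux | head -n 5

# Full-format listing, which includes the PID and PPID columns.
ps -ef | head -n 5

# Only the processes owned by the current user.
ps -u "$(id -un)" | head -n 5

# Step 2: the process with PID 1 (usually systemd or init).
ps -p 1 -o pid,comm
pid1_name=$(ps -p 1 -o comm=)
echo "PID 1 is: $pid1_name"

# Top 5 by memory, then top 5 by CPU (header line plus 5 rows each).
ps axo pid,comm,pcpu,pmem --sort=-pmem | head -n 6
ps axo pid,comm,pcpu,pmem --sort=-pcpu | head -n 6
```

The top and htop steps are interactive by design and are best followed by hand.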
3. Process Creation
Understanding how new processes come into existence in Linux is fundamental. Unlike some operating systems, Linux primarily uses a combination of two distinct system calls: fork() and exec().
The fork() System Call
- Purpose: To create a new process.
- Mechanism: When a process (the parent) calls fork(), the kernel creates a nearly identical copy of that process (the child). This includes copies of:
  - The parent's virtual address space (code, data, stack). Initially, this is often done using Copy-on-Write (CoW). This means the physical memory pages are not actually copied immediately. Both parent and child initially share the same physical pages, marked as read-only. Only when one process attempts to write to a shared page does the kernel create a private copy of that specific page for the writing process. This makes fork() very efficient, especially if the child immediately calls exec().
  - CPU registers (state).
  - Open file descriptors. Both parent and child initially share the same underlying open file table entries in the kernel. This means they point to the same position within a file, which can be important for coordination or lead to unexpected behavior if not managed.
  - Signal handling settings.
  - Current working directory.
  - Resource limits.
- Return Value: This is how the parent and child differentiate themselves after the call:
  - In the parent process, fork() returns the PID of the newly created child process.
  - In the child process, fork() returns 0.
  - If the fork() call fails (e.g., due to resource limits like exceeding the maximum number of processes), it returns -1 in the parent, and no child process is created.
- Execution: After a successful fork(), both the parent and child processes continue executing from the instruction immediately following the fork() call. They use the return value to determine which code path to take (parent-specific or child-specific).
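Two of these properties, the private copy of memory and the shared file offset, can be observed from the shell, because a ( ... ) subshell is created with fork(). The sketch below assumes a POSIX shell; the scratch file and the variable names are hypothetical and exist only for the demonstration.

```shell
# Scratch file used only for this demonstration.
demo_file=$(mktemp)
printf 'line 1\nline 2\n' > "$demo_file"

counter=1
exec 3< "$demo_file"          # open the file on descriptor 3 in the parent

# The subshell is a forked child: it starts with a copy of the parent's
# variables and shares the parent's open file descriptors.
(
    counter=2                 # changes only the child's private copy
    read -r child_line <&3    # advances the file offset shared with the parent
    echo "child:  counter=$counter, read '$child_line'"
)

read -r parent_line <&3       # continues where the child stopped
echo "parent: counter=$counter, read '$parent_line'"

exec 3<&-                     # close descriptor 3
rm -f "$demo_file"
```

The parent still sees counter=1 after the child changed its copy, yet it reads the second line of the file, because the read position belongs to the shared open file entry and not to either process.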
The exec() Family of System Calls
- Purpose: To replace the current process image with a new program.
- Mechanism: When a process calls one of the exec() functions (like execl(), execv(), execle(), execve(), execlp(), execvp()), the kernel does the following:
  - Loads the specified program file into the calling process's address space, overwriting the existing code, data, and stack segments.
  - Resets the process's execution state.
  - Starts executing the new program from its entry point (which, for a C program, leads to the main() function).
- Key Point: The exec() call, if successful, does not return. The original program code is gone, replaced entirely by the new program. The process itself continues to exist (it keeps the same PID, PPID, open files unless marked close-on-exec, etc.), but it's now running different code. If exec() does return, it means an error occurred (e.g., program not found, permission denied).
- Variants: The different exec functions vary in how they accept arguments:
  - l (e.g., execl, execlp): Arguments are passed as a null-terminated list directly in the function call (const char *arg0, const char *arg1, ..., NULL).
  - v (e.g., execv, execvp): Arguments are passed as a null-terminated array of character pointers (char *const argv[]).
  - p (e.g., execlp, execvp): The system searches for the executable file in the directories specified by the PATH environment variable if the filename doesn't contain a slash (/).
  - e (e.g., execle, execve): Allows specifying the environment variables for the new program image explicitly via an array (char *const envp[]). execve is the underlying system call that the others often wrap.
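The shell's exec builtin exposes this behavior directly: it replaces the shell with another program without creating a new process. A minimal sketch, assuming sh is available on the PATH; the variable names are illustrative.

```shell
# The outer sh prints its PID, then exec replaces it with a second sh,
# which prints its own PID. No new process is created in between.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
echo "$pids"

pid_before=$(echo "$pids" | sed -n '1p')
pid_after=$(echo "$pids" | sed -n '2p')
echo "before exec: $pid_before, after exec: $pid_after"
```

Both lines show the same number, which illustrates that exec changes the program a process runs but not the identity of the process.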
The Common Pattern: fork() followed by exec()
This is the standard way shells and many other programs launch new applications in Linux/Unix:
- The parent process (e.g., your `bash` shell) wants to run a command (e.g., `ls -l`).
- The parent calls `fork()`. Now there are two nearly identical shell processes; they differ mainly in their PIDs and in the value `fork()` returned to each of them.
- The parent process receives the child's PID from `fork()`. It might:
    - Wait for the child to complete using `wait()` or `waitpid()` (if running the command in the foreground).
    - Continue doing other things (if running the command in the background using `&`).
- The child process receives `0` from `fork()`. It knows it's the child.
- The child process typically calls one of the `exec()` functions (in pseudo-notation, `execvp("ls", ["ls", "-l", NULL])`).
- The kernel replaces the child process's shell image with the `ls` program image. The child process (still with the PID it was given by `fork()`) now runs `ls -l`.
- When `ls` finishes, it calls `exit()`.
- The parent process (if waiting) is notified by the kernel: its `wait()` call returns, it collects the child's exit status, and it continues.
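The steps above can be sketched with shell constructs that map onto the three calls. This is an illustration under the assumption of a POSIX-style shell; the exit status 7 is an arbitrary value chosen so that it is easy to recognize.

```shell
# "fork": the parentheses plus & start a child copy of the shell in the background.
# "exec": inside that child, exec replaces the shell with another program.
( exec sh -c 'echo "child running as PID $$"; exit 7' ) &
child=$!          # the parent learns the child's PID, as it would from fork()

wait "$child"     # "wait": block until the child terminates
status=$?         # the exit status collected from the child

echo "child $child exited with status $status"
```

The PID printed by the child matches `$child`, since `exec` kept the process and only swapped the program.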
The Init Process (PID 1)
- The very first user-space process started by the kernel during boot has PID 1.
- Traditionally, this was the `init` process. In modern Linux distributions, it's usually `systemd`.
- PID 1 is the ultimate ancestor of almost all other user-space processes.
- It has special responsibilities, including:
    - Starting essential system services and managing them.
    - Adopting Orphaned Processes: If a parent process terminates before its children, those children become "orphaned." The kernel automatically reparents them to PID 1, or to a closer ancestor that has registered itself as a "child subreaper" (a per-user `systemd` instance is a common example).
    - Reaping Zombies: PID 1 calls `wait()` whenever one of its children terminates, including children it has adopted, so their zombie entries are cleaned up. This prevents the long-term accumulation of zombies whose original parents died.
Understanding `fork()` and `exec()` explains why creating a new process doesn't automatically mean running a new program, and why running a new program usually involves creating a new process first. This model provides flexibility and resource efficiency (thanks to CoW).
Workshop Launching and Observing Child Processes
This workshop focuses on observing the `fork`/`exec` pattern in action using shell commands and process viewing tools.
Objective: Launch processes in the background and foreground, observe their PIDs and PPIDs, and see the parent-child relationship using `ps`.
Tools: Shell (`bash`, `zsh`, etc.), `sleep`, `ps`.
Steps:
- Identify Your Shell's PID:
    - Remind yourself of your current shell's PID, for example with `echo $$`.
    - Let's say your shell's PID is `20000`.
- Run a Command in the Foreground:
    - Execute a simple command, such as `ls /tmp`.
    - Conceptual Explanation:
        - Your shell (PID `20000`) calls `fork()`. A child shell process is created (e.g., PID `20050`).
        - The parent shell (`20000`) waits.
        - The child shell (`20050`) calls `execvp("ls", ["ls", "/tmp", NULL])`. Its image is replaced by `ls`.
        - The `ls` program (still PID `20050`) runs, lists `/tmp`, and then calls `exit()`.
        - The parent shell (`20000`) receives the exit status from `wait()` and gives you the prompt back.
    - Observation: This happens too fast to observe easily with `ps`. It simply illustrates the underlying mechanism for standard command execution.
- Run a Command in the Background:
    - Now, run a command that takes some time, like `sleep`, in the background: `sleep 30 &`
    - Output: The shell will likely print something like `[1] 20055`. This is the job number (`[1]`) and the PID of the child process (`20055`).
    - Conceptual Explanation:
        - Your shell (PID `20000`) calls `fork()`. A child shell process (PID `20055`) is created.
        - The parent shell (`20000`) does not wait because of the `&`. It immediately gives you the prompt back.
        - The child process (`20055`) calls `execvp("sleep", ["sleep", "30", NULL])`. Its image is replaced by `sleep`.
        - The `sleep` program (PID `20055`) runs for 30 seconds. When it finishes, it exits. The shell might print a "Done" message later when it reaps the child.
- Observe the Background Process Parent:
    - While the `sleep 30` command is still running (within 30 seconds), use `ps` to check its details, specifically its PPID. Use the PID reported in the previous step (`20055` in our example), for instance `ps -o pid,ppid,s,cmd -p 20055`.
    - Output: Notice that the `PPID` is `20000`, which is the PID of your interactive shell. This confirms the parent-child relationship established by `fork()`.
- Create a Process Chain (Subshell):
    - Shells can also create subshells without necessarily executing an external command immediately. Parentheses `()` create a subshell.
    - Run the following: `(sleep 60 &)`
    - Explanation:
        - The outer parentheses `( ... )` cause your main shell (`20000`) to `fork()` a child shell (let's say PID `20060`).
        - This child shell (`20060`) then executes the command inside the parentheses: `sleep 60 &`.
        - To do this, the child shell (`20060`) itself calls `fork()` to create another child process (let's say PID `20061`).
        - The parent subshell (`20060`) doesn't wait (due to `&`) and exits almost immediately.
        - The grandchild process (`20061`) calls `execvp("sleep", ["sleep", "60", NULL])` and runs `sleep 60`.
    - Observation: Find the `sleep 60` process, for example with `ps -o pid,ppid,s,cmd -C sleep`. Since the intermediate shell (`20060`) exited very quickly, the `sleep` process (`20061`) has most likely been orphaned.
    - Expected Output (potentially):
          PID  PPID S CMD
        20061     1 S sleep 60
    - If the `PPID` is `1` (the PID of `systemd`/`init`), it means the intermediate parent shell (`20060`) terminated, and the `sleep` process (`20061`) was orphaned and adopted by PID 1. This demonstrates orphan adoption. On systems where a child subreaper such as a per-user `systemd` instance is running, the `PPID` may be that process's PID instead of `1`. It will not normally be your original shell (`20000`), which was only the grandparent.
- Clean Up:
    - You might still have `sleep` processes running. Find them and kill them, for example with `pkill sleep` (this signals every `sleep` process you own).
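The background and subshell steps can also be run as one non-interactive script. This is a sketch that assumes the procps version of `ps`; the variable names and the temporary file used to pass the orphan's PID out of the subshell are illustrative choices, not part of the workshop.

```shell
shell_pid=$$

# Background child: its parent should be this shell.
sleep 30 &
child=$!
child_ppid=$(ps -o ppid= -p "$child" | tr -d ' ')
echo "shell PID: $shell_pid  child PID: $child  child PPID: $child_ppid"

# Subshell that backgrounds sleep and exits at once, orphaning it.
pidfile=$(mktemp)
( sleep 60 & echo "$!" > "$pidfile" )
orphan=$(cat "$pidfile")
orphan_ppid=$(ps -o ppid= -p "$orphan" | tr -d ' ')
echo "orphan PID: $orphan  orphan PPID: $orphan_ppid"

# Clean up both sleep processes and the temporary file.
kill "$child" "$orphan"
rm -f "$pidfile"
```

The orphan's PPID is whatever process adopted it (PID 1 or a subreaper), which is why the script only prints it instead of assuming a particular value.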
Summary: In this workshop, you practiced launching commands in the background using `&`. By examining the PID and PPID with `ps`, you directly observed the parent-child relationship created by the shell's use of `fork()`. You also explored how subshells and backgrounding can lead to orphaned processes being adopted by PID 1, illustrating another key aspect of the process lifecycle and hierarchy. This reinforces the concepts of `fork`, `exec`, parent/child relationships, and process adoption.
4. Process Termination and Signals
Processes don't run forever; they eventually terminate. Termination can be voluntary (the process finishes its work) or involuntary (it's forced to stop). Signals are the primary mechanism used in Linux/Unix systems for communicating events or requests, including termination requests, between processes or between the kernel and a process.
Process Termination
- Normal Exit:
    - A process typically terminates normally by returning from `main()` or by calling the C library function `exit()`, which performs cleanup such as flushing I/O buffers and then invokes the `_exit()` system call. A process can also call `_exit()` directly to skip that cleanup.
    - The process provides an exit status (an integer value between 0 and 255) to the kernel.
    - Convention: An exit status of 0 indicates success. A non-zero exit status indicates failure or a specific error condition.
    - The kernel stores this exit status in the terminated process's (now zombie) process table entry until the parent process retrieves it using a `wait()` family system call.
- Abnormal Termination (via Signals):
    - A process can be terminated prematurely if it receives a signal whose default action is to terminate and it has not caught or ignored that signal.
    - Common terminating signals include `SIGTERM`, `SIGKILL`, `SIGINT`, `SIGHUP`, `SIGQUIT`, and `SIGSEGV`.
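Both kinds of termination are visible from a shell through `$?`. The sketch below relies on the common shell convention (not a kernel rule) of reporting a child killed by signal N as status 128 + N; the variable names are illustrative.

```shell
# Normal exit: the child chooses its own exit status.
sh -c 'exit 3'
normal_status=$?

# Abnormal termination: the child sends SIGTERM (15) to itself.
sh -c 'kill -TERM $$'
signal_status=$?

# kill -l translates such a status back into a signal name.
signal_name=$(kill -l "$signal_status")

echo "normal exit status:      $normal_status"
echo "status after the signal: $signal_status ($signal_name)"
```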
Signals
Signals are asynchronous software interrupts delivered to a process. They notify a process that a particular event has occurred.
- Sources of Signals:
    - Kernel: Notifying the process of hardware exceptions (e.g., `SIGSEGV` - segmentation fault/invalid memory access, `SIGFPE` - floating-point exception) or software events (e.g., `SIGPIPE` - writing to a pipe with no reader, `SIGALRM` - timer expired).
    - Other Processes: A process with appropriate permissions can explicitly send a signal to another process using the `kill()` system call (or commands like `kill`, `pkill`).
    - Terminal Driver: When you press certain key combinations in the terminal (e.g., `Ctrl+C`, `Ctrl+Z`), the terminal driver sends signals (`SIGINT`, `SIGTSTP`, respectively) to the foreground process group.
- Signal Handling: When a signal is delivered to a process, the process can take one of three actions:
    - Default Action: Every signal has a default action defined by the system. Common defaults include:
        - Terminate the process.
        - Terminate the process and dump core (create a file with the process's memory image for debugging).
        - Ignore the signal.
        - Stop the process.
        - Continue the process (if stopped).
    - Catch the Signal: The process can register a specific function (a signal handler) to be executed when the signal arrives. This allows the process to perform custom actions, like gracefully shutting down, reloading configuration, or cleaning up resources before exiting.
    - Ignore the Signal: The process can explicitly request that the signal be ignored.
- Exceptions: Two signals, `SIGKILL` (9) and `SIGSTOP` (19), cannot be caught, blocked, or ignored by the process. They always perform their default action (terminate immediately and stop, respectively). This provides a reliable way for the kernel and system administrator to control processes.
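The three dispositions can be tried out with the shell's `trap` built-in, which installs a handler (or ignores a signal) for the shell process itself. A minimal sketch, with illustrative messages and variable names:

```shell
# Catch: install a handler for SIGTERM, then send the signal to ourselves.
caught=$(sh -c 'trap "echo caught SIGTERM; exit 0" TERM; kill -TERM $$; echo "not reached"')

# Ignore: an empty handler string makes the shell ignore SIGTERM.
ignored=$(sh -c 'trap "" TERM; kill -TERM $$; echo "still running"')

# SIGKILL cannot be caught or ignored: the process is terminated regardless.
sh -c 'kill -KILL $$; echo "not reached"'
kill_status=$?

echo "catch:   $caught"
echo "ignore:  $ignored"
echo "SIGKILL: exit status $kill_status"
```

Real daemons use the same idea with handlers written in their own language, for example to flush data on `SIGTERM` or reload configuration on `SIGHUP`.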
Common Signals and Their Default Actions:
Signal Name | Number | Default Action | Description / Common Use |
---|---|---|---|
`SIGHUP` | 1 | Terminate | Hangup detected on controlling terminal or death of session leader. Often used to signal daemons to reload configuration. |
`SIGINT` | 2 | Terminate | Interrupt from keyboard (`Ctrl+C`). Request to interrupt/stop. |
`SIGQUIT` | 3 | Terminate (Core) | Quit from keyboard (`Ctrl+\`). Similar to SIGINT but dumps core. |
`SIGILL` | 4 | Terminate (Core) | Illegal Instruction. CPU detected an invalid instruction. |
`SIGABRT` | 6 | Terminate (Core) | Abort signal. Usually sent by a process to itself on assertion failure. |
`SIGFPE` | 8 | Terminate (Core) | Floating-Point Exception. Despite the name, also raised for integer division by zero. |
`SIGKILL` | 9 | Terminate | Kill signal. Cannot be caught or ignored. Forceful termination. |
`SIGUSR1` | 10 | Terminate | User-defined signal 1. Application specific use. |
`SIGSEGV` | 11 | Terminate (Core) | Segmentation Violation. Invalid memory reference. |
`SIGUSR2` | 12 | Terminate | User-defined signal 2. Application specific use. |
`SIGPIPE` | 13 | Terminate | Broken pipe: write to a pipe with no process reading from it. |
`SIGALRM` | 14 | Terminate | Alarm clock timer expired (set by `alarm()` or `setitimer()`). |
`SIGTERM` | 15 | Terminate | Termination signal. Standard, polite request to terminate. Can be caught/ignored. Preferred over SIGKILL. |
`SIGCHLD` | 17 | Ignore | Child process terminated, stopped, or continued. |
`SIGCONT` | 18 | Continue/Ignore | Continue if stopped, otherwise ignore. |
`SIGSTOP` | 19 | Stop | Stop signal. Cannot be caught or ignored. Pauses execution. |
`SIGTSTP` | 20 | Stop | Stop typed at terminal (`Ctrl+Z`). Can be caught/ignored. |

The numbers shown are those used on x86 and ARM. A few of them (for example `SIGUSR1`, `SIGCHLD`, and `SIGSTOP`) differ on some other architectures, so prefer signal names over numbers in scripts. You can get a full list on your system using `kill -l` or `man 7 signal`.
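Besides printing the whole list, `kill -l` can translate a single signal number, or the exit status of a process that was killed by a signal, into its name. A short sketch with illustrative variable names:

```shell
term_name=$(kill -l 15)       # signal number -> name
kill_name=$(kill -l 9)
from_status=$(kill -l 137)    # exit status 128 + 9 -> name

echo "15 is SIG$term_name, 9 is SIG$kill_name"
echo "exit status 137 means the process was ended by SIG$from_status"
```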
Commands for Sending Signals:
- `kill <PID>`
    - Sends a signal to the specified process ID.
    - By default (if no signal is specified), it sends `SIGTERM` (15).
    - Syntax: `kill [-s <signal_name_or_number>] <PID> ...` or `kill -<signal_number> <PID> ...`
    - Examples:
        - `kill 12345`: Sends `SIGTERM` to PID 12345 (polite request to terminate).
        - `kill -9 12345`: Sends `SIGKILL` to PID 12345 (forceful termination).
        - `kill -SIGINT 12345`: Sends `SIGINT` (same as `Ctrl+C`) to PID 12345.
        - `kill -SIGHUP 12345`: Sends `SIGHUP` (often for config reload) to PID 12345.
- `pkill <pattern>`
    - Sends a signal to processes whose name (or other attributes) matches the provided pattern.
    - Useful when you know the name of the process but not its exact PID, or want to signal multiple instances.
    - By default, sends `SIGTERM`.
    - Syntax: `pkill [-signal] [-f] <pattern>`
        - `-f`: Match the pattern against the full command line, not just the process name.
    - Examples:
        - `pkill firefox`: Sends `SIGTERM` to all processes named `firefox`.
        - `pkill -9 -f troublesome_script.sh`: Sends `SIGKILL` to processes whose command line contains `troublesome_script.sh`. The `-f` matters here: without it, `pkill` compares the pattern with the process name, which the kernel truncates to 15 characters and which for a script is often the interpreter (such as `bash`) rather than the script's filename.
        - `pkill -f "python my_long_running_job.py"`: Sends `SIGTERM` to processes whose command line contains this string.
- `killall <process_name>`
    - Similar to `pkill`, but typically matches only the exact process name. Behavior can sometimes differ slightly between implementations.
    - By default, sends `SIGTERM`.
    - Syntax: `killall [-signal] <process_name>`
    - Example:
        - `killall -SIGKILL nginx`: Sends `SIGKILL` to all processes exactly named `nginx`.
Choosing Between SIGTERM and SIGKILL:
- Always try `SIGTERM` (15) first. This gives the process a chance to shut down cleanly: save its state, close files, clean up temporary resources, and terminate gracefully. This is the polite way.
- Use `SIGKILL` (9) only as a last resort if a process is unresponsive to `SIGTERM` or is causing severe problems. `SIGKILL` terminates the process immediately without giving it any chance to clean up. This can lead to data corruption, orphaned resources, or inconsistent application state. It's the "pull the plug" approach.
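The "polite first, forceful last" rule is easy to wrap in a small helper. The function below is a hypothetical sketch: the name `terminate` and the default grace period of 5 seconds are assumptions made for this example, not a standard tool.

```shell
# terminate PID [GRACE_SECONDS]: send SIGTERM, wait up to GRACE_SECONDS for
# the process to disappear, then fall back to SIGKILL.
terminate() {
    pid=$1
    grace=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0     # nothing to do: already gone
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0    # it exited after SIGTERM
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null                 # last resort
}

# Usage: a shell that ignores SIGTERM can only be ended with SIGKILL.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
stubborn=$!
sleep 1                     # give the child time to install its trap
terminate "$stubborn" 2
wait "$stubborn"
stubborn_status=$?
echo "stubborn process ended with status $stubborn_status"
```

`kill -0` sends no signal at all; it only checks whether the PID still exists, which makes it a convenient way to poll during the grace period.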
Workshop Sending Signals to Processes
In this workshop, you'll practice sending different signals (`SIGTERM`, `SIGKILL`, `SIGSTOP`, `SIGCONT`) to processes using the `kill` and `pkill` commands.
Objective: Observe the effects of different signals on a running process. Understand the difference between `SIGTERM` and `SIGKILL`. Practice using `kill` with PIDs and `pkill` with names.
Tools: `sleep`, `yes`, `kill`, `pkill`, `ps`, shell job control (`Ctrl+Z`, `jobs`, `fg`, `bg`).
Steps:
- Prepare Target Processes:
    - Open two separate terminal windows (or use terminal multiplexer panes like `tmux`).
    - In Terminal 1, start a simple `sleep` process in the background, for example `sleep 1000 &`, and note the PID the shell prints (`30010` in this walkthrough).
    - In Terminal 2, start a CPU-intensive `yes` process in the background, for example `yes > /dev/null &`.
- Send `SIGTERM` (Polite Termination):
    - From either terminal, send `SIGTERM` to the `sleep` process using its PID (replace `30010` with the actual PID): `kill 30010`
    - Switch back to Terminal 1 where `sleep` was started. After a brief moment (you may need to press Enter), you should see a message like `[1]+ Terminated sleep 1000`.
    - Verify it's gone using `ps` or `jobs -l` in Terminal 1.
    - Explanation: `sleep` does not install a handler for `SIGTERM`, so the signal's default action applies and the process terminates right away.
- Send `SIGKILL` (Forceful Termination):
    - The `yes` process is likely still running and consuming CPU. Let's try `SIGTERM` first (using `pkill` this time for variety): `pkill yes`
    - Check if it terminated (use `jobs -l` in Terminal 2 or `ps aux | grep yes`). `yes` does not catch `SIGTERM` either, so it will normally be gone already; a program that catches or ignores `SIGTERM` could still be running at this point.
    - If `yes` is still running, or to see the effect of `SIGKILL` (start a new `yes` first if necessary), send `SIGKILL`: `pkill -9 yes`
    - Check again. The `yes` process should now be gone immediately.
    - Explanation: `SIGKILL` cannot be caught, blocked, or ignored; the kernel terminates the process immediately.
- Stop and Continue a Process:
    - Start another `sleep` process in Terminal 1, for example `sleep 500 &`, and note its PID.
    - Send the `SIGSTOP` signal using `kill` (remember, `SIGSTOP` cannot be caught): `kill -SIGSTOP <PID>`
    - Check its status in Terminal 1 with `jobs -l`.
        - Output: You should see the state indicated as `Stopped`.
    - Verify with `ps`, for example `ps -o pid,stat,cmd -p <PID>`.
        - Output: Should show state `T`.
    - Now, send the `SIGCONT` signal to resume it: `kill -SIGCONT <PID>`
    - Check its status again in Terminal 1 with `jobs -l`.
        - Output: The state should now be `Running` (meaning it's back in the background, likely sleeping).
    - Verify with `ps` as before.
        - Output: Should show state `S`.
    - Clean up: `kill <PID>`
- Interactive Stop (`Ctrl+Z`) and `SIGINT` (`Ctrl+C`):
    - Start a foreground process in Terminal 1, for example `sleep 100`.
    - Press `Ctrl+Z`. This sends `SIGTSTP` (Terminal Stop). The shell should indicate the job is stopped, and you get your prompt back. Check with `jobs -l`.
    - Bring it back to the foreground: `fg %<job_number>`.
    - Now, press `Ctrl+C`. This sends `SIGINT` (Interrupt). The `sleep` command should terminate, and you get your prompt back.
    - Explanation: This demonstrates how terminal control keys map to specific signals (`SIGTSTP`, `SIGINT`) sent to the foreground process group.
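The stop-and-continue step can be reproduced non-interactively as well. This sketch assumes the procps version of `ps`; the one-second pauses are a simple way to let each state change become visible before it is sampled, and the variable names are illustrative.

```shell
sleep 300 &
pid=$!

kill -STOP "$pid"                     # cannot be caught or ignored
sleep 1
stopped_state=$(ps -o stat= -p "$pid" | tr -d ' ')

kill -CONT "$pid"                     # resume the stopped process
sleep 1
resumed_state=$(ps -o stat= -p "$pid" | tr -d ' ')

kill -TERM "$pid"                     # clean up politely
wait "$pid"
final_status=$?

echo "after SIGSTOP: $stopped_state"
echo "after SIGCONT: $resumed_state"
echo "after SIGTERM: exit status $final_status"
```

The state column starts with `T` while the process is stopped and with `S` once it is sleeping again; `ps` may append further flag characters to that first letter.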
Summary: In this workshop, you used `kill` and `pkill` to send signals (`SIGTERM`, `SIGKILL`, `SIGSTOP`, `SIGCONT`) to processes, observing their different effects. You saw the importance of trying `SIGTERM` before resorting to `SIGKILL`. You also practiced stopping and resuming processes using signals and related this to the familiar `Ctrl+Z` and `Ctrl+C` terminal actions. This practical experience solidifies your understanding of signal handling and process control.
5. Process Scheduling and Priorities
In any multitasking operating system like Linux, there are usually more runnable processes than available CPU cores. The scheduler is the core component of the kernel responsible for deciding which runnable process gets to use a CPU core, when, and for how long. Process scheduling aims to balance competing goals like fairness, responsiveness (for interactive tasks), and throughput (for batch tasks).
The Scheduler's Role
- Multiplexing: Share the CPU(s) among multiple processes over time, creating the illusion of simultaneous execution.
- Resource Allocation: Decide which process gets the valuable CPU resource next.
- Context Switching: When the scheduler decides to switch from one process (A) to another (B), it needs to:
- Save the complete execution context of process A (CPU registers, program counter, etc.).
- Load the previously saved execution context of process B.
- Resume execution of process B. This context switch has overhead and takes a small amount of time.
Scheduling Algorithms
Linux has employed various scheduling algorithms over its history. For normal (non-real-time) processes, kernels since 2.6.23 have used the Completely Fair Scheduler (CFS); starting with Linux 6.6 it has been succeeded by the EEVDF scheduler, which keeps the nice-based weighting described here.
- CFS Goal: To distribute processor time fairly among runnable processes. It tries to give each process an "ideal" share of the CPU based on its priority (nice value). It uses a red-black tree data structure to efficiently find the process that has run the least relative to its fair share.
- Time Slices: Instead of fixed time slices, CFS assigns processes CPU time proportionally based on their weight (derived from their nice value). Processes that have received less than their fair share get priority.
- Real-time Scheduling: Linux also supports real-time scheduling policies (`SCHED_FIFO`, `SCHED_RR`) for processes with strict timing requirements. These have higher priority than normal CFS processes and are scheduled differently (e.g., FIFO runs until it blocks or yields, RR runs for a fixed time slice before potentially yielding to another RR process of the same priority).
Process Priorities
Not all processes are equally important. Linux allows influencing the scheduler's decisions through process priorities. There are two main types:
- Static Priority (Real-time):
    - Used for real-time processes (`SCHED_FIFO`, `SCHED_RR`).
    - Ranges from 1 to 99 (higher value = higher priority).
    - Real-time processes always run before any normal CFS processes if they are runnable.
    - Requires special privileges (root or the `CAP_SYS_NICE` capability) to set.
- Dynamic Priority (Normal Processes / CFS):
    - Used for regular user processes (`SCHED_NORMAL`, `SCHED_BATCH`, `SCHED_IDLE`).
    - Influenced by the nice value.
    - Nice Value: An integer ranging from -20 to +19.
        - -20: Highest priority (least "nice" to other processes, meaning it gets more CPU time relative to others). Setting negative values requires root privileges.
        - 0: Default priority for processes started by normal users.
        - +19: Lowest priority (most "nice" to other processes; it gets less CPU time when there's contention). Any user can increase the niceness (lower the priority) of their own processes.
    - The kernel uses the nice value to calculate the process's weight within CFS. A lower nice value means a higher weight, resulting in a larger proportional share of the CPU time over the long run. It's a relative priority among CFS processes.
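To get a feel for how nice values turn into CPU shares, the weighting can be approximated numerically. The sketch below assumes the commonly documented rule of thumb that nice 0 has weight 1024 and each nice step changes the weight by a factor of about 1.25; the kernel itself uses a precomputed table whose values are close to, but not always identical with, this formula. The function names `weight` and `share` are made up for this example.

```shell
# Approximate scheduler weight for a nice value.
weight() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", 1024 / (1.25 ^ n) }'
}

# Percentage of one contended CPU that the FIRST listed nice value receives
# when all listed tasks are runnable at the same time.
share() {
    total=0
    for n in "$@"; do
        total=$((total + $(weight "$n")))
    done
    awk -v w="$(weight "$1")" -v t="$total" 'BEGIN { printf "%.1f\n", 100 * w / t }'
}

w0=$(weight 0)
w10=$(weight 10)
echo "weight at nice 0:  $w0"
echo "weight at nice 10: $w10"
echo "share of a nice-0 task next to nice 0 and nice 10:  $(share 0 0 10)%"
echo "share of the nice-10 task next to two nice-0 tasks: $(share 10 0 0)%"
```

The shares only apply while the tasks actually compete for the same CPU; a niced task that has a core to itself still runs at full speed.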
CPU Bound vs. I/O Bound Processes
The scheduler often tries to differentiate between two types of processes:
- CPU-Bound: Processes that spend most of their time performing computations and rarely block for I/O. Examples: complex simulations, video encoding, data analysis loops. CFS tries to give them fair CPU time but doesn't necessarily need to prioritize them for responsiveness.
- I/O-Bound: Processes that spend most of their time waiting for I/O operations (disk reads/writes, network activity, terminal input). Examples: text editors, web servers waiting for requests, shells waiting for commands. They run in short bursts between I/O waits. The scheduler often tries to prioritize I/O-bound processes when they become runnable (after I/O completes) to improve system responsiveness and keep interactive applications feeling quick.
Commands for Managing Priorities (nice and renice)
- `nice`
    - Purpose: Launch a new command with a specified niceness level (priority).
    - Syntax: `nice -n <niceness_level> <command> [arguments...]`
        - `<niceness_level>`: An integer that is added to the niceness the command would otherwise inherit from the shell (normally 0), so the resulting niceness typically lies between -20 and 19. If `-n` is omitted, the increment defaults to 10 (making the command lower priority).
        - Only root can specify negative increments (higher priority). Normal users can only specify positive increments (lower priority).
    - Examples:
        - `nice -n 15 ./my_batch_job.sh`: Start the script with a low priority (niceness 15 if the shell runs at 0).
        - `nice ./my_script.sh`: Start the script with niceness 10 (if parent was 0).
        - `sudo nice -n -5 ./important_task`: Start the task with higher priority (niceness -5).
- `renice`
    - Purpose: Change the niceness level of already running processes.
    - Syntax: `renice [-n] <priority> [-p <PID>...] [-g <PGRP>...] [-u <user>...]`
        - `<priority>`: The absolute niceness value to set (from -20 to 19).
        - `-p <PID>`: Specify process IDs to change.
        - `-g <PGRP>`: Specify process group IDs to change.
        - `-u <user>`: Specify username or UID; change priority for all processes owned by that user.
        - The `-n` is often optional when setting the absolute priority.
    - Permissions:
        - A user can only `renice` processes they own, and can only increase the niceness value (lower the priority).
        - Root can `renice` any process to any valid niceness value.
    - Examples:
        - `renice 10 -p 12345`: Set the niceness of PID 12345 to 10 (lower priority). (Requires ownership or root).
        - `sudo renice -5 -p 12345`: Set the niceness of PID 12345 to -5 (higher priority). (Requires root).
        - `renice 19 -u bob`: Set the niceness of all of user bob's processes to 19 (lowest priority). (Requires root or being user bob).
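Both commands can be checked against the `NI` column of `ps`. The sketch below assumes the procps `ps` and the util-linux `renice` (which treats the value given to `-n` as an absolute niceness by default), and that the invoking shell itself runs at niceness 0.

```shell
# Start a command with its niceness raised by 10.
nice -n 10 sleep 60 &
pid=$!
sleep 1                                   # let nice adjust the priority and exec sleep
ni_start=$(ps -o ni= -p "$pid" | tr -d ' ')

# Raise the niceness of the running process further (no root needed for this direction).
renice -n 15 -p "$pid" > /dev/null
ni_after=$(ps -o ni= -p "$pid" | tr -d ' ')

kill "$pid"                               # clean up
echo "niceness after nice:   $ni_start"
echo "niceness after renice: $ni_after"
```

Trying to go back down (for example from 15 to 10) as a normal user is normally refused with a permission error, which matches the rule that only root may lower niceness.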
Understanding scheduling and priorities helps you manage long-running tasks, ensure interactive performance, and optimize resource allocation on your Linux system.
Workshop Adjusting Process Priorities
In this workshop, you will launch CPU-intensive processes and use `nice` and `renice` to observe the effect of changing priorities on CPU allocation, as viewed through `top` or `htop`.
Objective: Understand how to launch processes with non-default priorities using `nice` and modify the priority of running processes using `renice`. Observe the impact on CPU usage in a multi-process environment.
Tools: `yes`, `nice`, `renice`, `top` or `htop`, `pkill`.
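Prerequisites: Niceness only changes CPU shares when processes compete for the same CPU. The effect is clearest when there are more busy processes than CPU cores; on a machine with many idle cores every `yes` process may simply get a core to itself. In that case, either start more `yes` processes than you have cores or pin them all to one core (for example by prefixing each command with `taskset -c 0`).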
Steps:
- Start Baseline CPU Load:
    - Open a terminal (Terminal 1).
    - Start a CPU-bound process with default priority (niceness 0), for example `yes > /dev/null &`. Note the PID the shell prints (`40010` in this walkthrough).
- Monitor with `top` or `htop`:
    - Open another terminal (Terminal 2).
    - Start `htop` (preferred) or `top`.
    - Observe the `yes` process. It should be consuming close to 100% of one CPU core. Note its `NI` (Nice) value (should be `0`) and `%CPU`.
- Start a Second CPU Load (Default Priority):
    - Go back to Terminal 1.
    - Start a second `yes` process, also with default priority, for example `yes > /dev/null &` (PID `40020` in this walkthrough).
    - Go back to Terminal 2 (`htop`/`top`).
    - Observe the two `yes` processes. They should now be sharing the CPU time relatively equally (each getting around 50% on a single core being shared, or each consuming close to 100% if you have 2+ cores available for them). Both should have `NI` = `0`.
- Start a Third CPU Load with Lower Priority (`nice`):
    - Go back to Terminal 1.
    - Start a third `yes` process, but this time give it a lower priority (higher niceness) using the `nice` command, for example `nice -n 10 yes > /dev/null &` (PID `40030` in this walkthrough).
    - Go back to Terminal 2 (`htop`/`top`).
    - Observe the three `yes` processes.
        - As long as they compete for the same CPU, the first two (`NI`=0) should still be getting a significantly larger share of the CPU time compared to the third one.
        - The third `yes` process should have `NI` = `10` and a noticeably lower `%CPU` value.
        - Explanation: The scheduler (CFS) sees that the third process has a lower weight due to its higher niceness value and allocates it proportionally less CPU time when the other higher-priority processes are also runnable.
- Change Priority of a Running Process (`renice`):
    - Let's make the first `yes` process (PID `40010`, `NI`=0) even less important than the third one. We will increase its niceness value using `renice`.
    - From Terminal 1 (or any terminal), run for example `renice -n 15 -p 40010`.
        - You might see output like: `40010 (process ID) old priority 0, new priority 15`
    - Go back to Terminal 2 (`htop`/`top`).
    - Observe the processes again:
        - The second `yes` process (PID `40020`, `NI`=0) should now be getting the largest share of CPU time.
        - The third `yes` process (PID `40030`, `NI`=10) should get the next largest share.
        - The first `yes` process (PID `40010`), which we just reniced, should now have `NI` = `15` and the smallest share of CPU time.
- (Optional - Requires Root) Give a Process Higher Priority (`renice`):
    - Let's make the third process (PID `40030`, currently `NI`=10) the highest priority. This requires root privileges.
    - From Terminal 1 (or any terminal), run for example `sudo renice -n -5 -p 40030`.
        - You'll be prompted for your password.
    - Go back to Terminal 2 (`htop`/`top`).
    - Observe:
        - The third `yes` process (PID `40030`) should now have `NI` = `-5` and should be consuming the most CPU time, significantly more than the others.
        - The second process (PID `40020`, `NI`=0) gets the next share.
        - The first process (PID `40010`, `NI`=15) gets the least.
- Clean Up:
    - Stop all the `yes` processes using `pkill`: `pkill yes`
    - Close `htop`/`top` (press `q`).
Summary: This workshop demonstrated the practical use of `nice` to start processes with altered priority and `renice` to change the priority of running processes. By observing CPU usage in `htop`/`top`, you saw how the Linux scheduler allocates CPU time based on niceness values, giving preference to processes with lower niceness (higher priority). You also saw that setting higher priorities (negative niceness) requires root privileges.
6. Background and Foreground Processes (Job Control)
When you work in an interactive shell (like `bash` or `zsh`), you often run commands one after another. Typically, the shell waits for one command to finish before prompting you for the next. This is running a command in the foreground. However, shells also provide job control features that allow you to manage multiple processes concurrently, switching them between the foreground and background, and suspending/resuming them.
Foreground vs. Background
- Foreground Process: A process running in the foreground has control of the terminal.
    - It can read input directly from the keyboard.
    - Its standard output and standard error are typically connected to the terminal display.
    - The shell usually waits for the foreground process to complete before issuing a new prompt.
    - It receives terminal-generated signals like `SIGINT` (`Ctrl+C`) and `SIGTSTP` (`Ctrl+Z`).
    - There can only be one foreground process group associated with a terminal at any time.
- Background Process: A process running in the background does not have control of the terminal.
    - It cannot directly read input from the keyboard (attempting to do so usually causes it to be stopped by a `SIGTTIN` signal).
    - Its standard output and standard error are still typically connected to the terminal (unless redirected), so you might see its output interspersed with your foreground commands.
    - The shell does not wait for background processes to complete; it immediately gives you a new prompt.
    - It does not receive keyboard-generated signals like `SIGINT` or `SIGTSTP`.
Shell Job Control Commands
Most modern shells provide built-in commands for managing jobs (a job is essentially a pipeline of one or more processes started from the shell):
- `&` (Ampersand)
    - Purpose: Place this character at the end of a command line to start the command (or pipeline) as a background job.
    - Example: `sleep 300 &`
- `jobs`
    - Purpose: List the active jobs (backgrounded or stopped) associated with the current shell session.
    - Output: Shows job number (e.g., `[1]`), state (`Running`, `Stopped`), and the command.
    - Options:
        - `jobs -l`: Also display the Process ID (PID) of the job's process group leader.
        - `jobs -p`: List only the PIDs of the process group leaders.
    - In the output:
        - The `+` indicates the "current" job (default for `fg`/`bg`).
        - The `-` indicates the "previous" job.
- `Ctrl+Z`
    - Purpose: Suspend (stop) the current foreground process.
    - Action: Sends the `SIGTSTP` signal to the foreground process group. The process pauses execution, and the shell regains control of the terminal, usually printing the job number and "Stopped".
    - Example: Run `sleep 300`, then press `Ctrl+Z`.
- `bg` (Background)
    - Purpose: Resume a stopped job and run it in the background.
    - Syntax: `bg [%<job_number>]`
        - If `%<job_number>` is omitted, it usually acts on the "current" job (marked with `+` by `jobs`). You can specify a job using its number (e.g., `%1`, `%2`).
    - Example: After stopping `sleep 300` with `Ctrl+Z` (let's say it's job `[1]`), run `bg %1`. Now `sleep 300` is running in the background.
- `fg` (Foreground)
    - Purpose: Bring a backgrounded or stopped job into the foreground.
    - Syntax: `fg [%<job_number>]`
        - If `%<job_number>` is omitted, it usually acts on the "current" job (`+`).
    - Action: The shell gives control of the terminal back to the specified job. The shell waits for this job to complete or be suspended again.
    - Example: If `sleep 300` is running in the background as job `[1]`, run `fg %1`.
Job Specifiers (% Notation):
- `%N`: Job number N (e.g., `%1`).
- `%string`: Job whose command starts with `string` (e.g., `%sleep`).
- `%?string`: Job whose command contains `string`.
- `%%` or `%+`: The current job (marked with `+`).
- `%-`: The previous job (marked with `-`).
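Job specifiers are accepted by `kill` as well as by `fg` and `bg`, which makes them easy to try in a short script. This sketch assumes `bash`, and that no other jobs exist in the shell, so the two `sleep` commands become jobs 1 and 2; the variable names are illustrative.

```shell
sleep 300 &
first=$!
sleep 400 &
second=$!

jobs -l              # lists both jobs with job number, PID, and state

kill %1              # by job number
kill '%?400'         # by a string contained in the command (quoted to avoid globbing)

wait "$first";  first_status=$?
wait "$second"; second_status=$?
echo "job 1 ended with status $first_status, job 2 with status $second_status"
```

`fg` and `bg` themselves are meant for interactive use, since they hand the terminal to a job or take it back, so they are left out of this script.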
Detaching Processes Completely: nohup
and disown
What happens to background processes started with &
if you close the terminal or log out? By default, the shell often sends a SIGHUP
(Hangup) signal to its background jobs, which usually causes them to terminate. To keep a process running even after you log out, you need to detach it more thoroughly:
-
nohup <command> &
- Purpose: Run a command immune to hangups (
SIGHUP
) and redirect its output. - Action:
- Prevents the
SIGHUP
signal from terminating the command when the terminal closes. - Redirects standard output and standard error to a file named
nohup.out
in the current directory (or$HOME/nohup.out
if the current directory isn't writable), unless already redirected elsewhere. - Runs the command in the background implicitly (though adding
&
is common practice and ensures the shell gives the prompt back immediately).
- Prevents the
- Example:
- Purpose: Run a command immune to hangups (
- `disown` (Bash/Zsh built-in)
  - Purpose: Remove a job from the shell's active job table. This prevents the shell from sending `SIGHUP` to it on exit.
  - Syntax: `disown [-h] [%<job_id>]`
    - If no job ID is given, acts on the current job.
    - `-h` (Bash): Mark the job so `SIGHUP` is not sent, but keep it in the job table (less common). Without `-h`, it's removed entirely.
  - Usage: Often used after starting a job in the background (`&`), or after stopping it (`Ctrl+Z`) and resuming it in the background (`bg`).
  - Example: After `sleep 300 &`, running `disown %+` removes that job from the job table, so it no longer appears in the output of `jobs`.
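The effect of `disown` can be observed directly. This sketch assumes Bash, where `disown` accepts either a job specifier or a PID; the variable names are illustrative.

```shell
sleep 300 &
pid=$!
disown "$pid"                      # Bash: a job spec such as %+ would work too
listed=$(jobs -p)                  # empty: the shell no longer tracks the job
if kill -0 "$pid" 2>/dev/null; then alive=yes; else alive=no; fi
kill "$pid"                        # clean up; disown itself never signals the process
echo "listed='$listed' alive=$alive"
```

The process keeps running after `disown`; only the shell's bookkeeping changes, which is why `fg`, `bg`, and job specifiers no longer apply to it.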
Advanced Session Management: `screen` and `tmux`
For managing long-running processes, multiple shells, and sessions that persist even if your network connection drops, terminal multiplexers like `screen` and `tmux` are invaluable tools. They allow you to:
- Create persistent sessions on a remote server.
- Detach from a session, log out, log back in later, and reattach to the session exactly as you left it (including running processes).
- Have multiple virtual "windows" (like tabs) and "panes" (split views) within a single terminal connection.
While a deep dive into `screen` or `tmux` is beyond the scope of this section, they are the standard solution for robustly managing processes that need to outlive your terminal login session. `nohup` and `disown` are simpler mechanisms suitable for specific cases.
Workshop Managing Foreground and Background Jobs
This workshop provides hands-on practice with shell job control features.
Objective: Learn to start jobs in the background, stop foreground jobs, list active jobs, and switch jobs between foreground and background. Practice using `nohup`.
Tools: Shell (`bash`, `zsh`, etc.), `sleep`, `vim` (or `nano`, `gedit`, or any simple text editor), `jobs`, `fg`, `bg`, `nohup`, `pkill`.
Steps:
- Start a Background Job:
  - Open a terminal.
  - Start a `sleep` command in the background: `sleep 600 &`
  - Note the job number and PID printed by the shell.
  - Verify it's running in the background using `jobs`:
    - You should see the `sleep` command listed as `Running`.
- Start and Stop a Foreground Job:
  - Start a simple text editor (like `vim` or `nano`) with a dummy filename, for example `vim temporary_file.txt`. This will run in the foreground.
  - Your terminal is now controlled by the editor.
  - Suspend the editor: Press `Ctrl+Z`.
  - The shell should print a message like `[2]+ Stopped vim temporary_file.txt` and give you the prompt back.
  - List the jobs again: `jobs`
    - You should now see two jobs: the `sleep` job (`Running`) and the editor job (`Stopped`). Note which one has the `+` (current job).
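- Resume Stopped Job in Background (`bg`):
  - The editor job is currently stopped. Resume it in the background: `bg %2`
  - The shell should indicate the job has been resumed in the background (e.g., `[2]+ vim temporary_file.txt &`).
  - Check `jobs` again:
    - The `sleep` job is listed as `Running`. The editor may show as `Running` only briefly: an interactive program such as `vim` tries to use the terminal, and a background job that does so is stopped again by the kernel (`SIGTTIN`/`SIGTTOU`), so `jobs` will typically report it as `Stopped` once more. A non-interactive job resumed with `bg` stays `Running`.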
- Bring Job to Foreground (`fg`):
  - Bring the backgrounded editor job back to the foreground: `fg %2`
  - You are now back inside the editor.
  - Exit the editor cleanly (e.g., in `vim`, type `:q!` and press Enter; in `nano`, press `Ctrl+X`).
  - Check `jobs` again. The editor job should be gone.
- Bring Another Job to Foreground and Terminate:
  - The `sleep 600` job (likely job `%1`) is still running in the background.
  - Bring it to the foreground: `fg %1`
  - The terminal will now seem to hang (it's sleeping).
  - Terminate the foreground process: Press `Ctrl+C` (sends `SIGINT`).
  - You should get your shell prompt back. Check `jobs`; it should be empty.
- Using `nohup`:
  - Let's simulate a long-running script that should continue even if we close the terminal.
  - Run `sleep` using `nohup`: `nohup sleep 300 &`
  - The shell will print the job number and PID, and `nohup` prints a message like `nohup: ignoring input and appending output to 'nohup.out'`.
  - Check for the `nohup.out` file: `ls nohup.out`. It should exist (it will likely be empty as `sleep` produces no output).
  - Check `jobs`. The `nohup sleep 300` command is listed as a background job.
  - Simulate Logout (Conceptual): If you were to close this terminal window now, the `sleep 300` process started with `nohup` would ignore the `SIGHUP` and would continue running until its 300 seconds are up. (We won't actually close the terminal here.)
  - Clean up: Since it's still associated with your shell as a job, you can kill it using its job specifier or PID (for example `kill %1` if it is job 1, or `pkill -f 'sleep 300'`).
Summary: In this workshop, you practiced the core shell job control features: starting jobs in the background (`&`), listing jobs (`jobs`), suspending foreground jobs (`Ctrl+Z`), resuming jobs in the background (`bg`), and bringing jobs to the foreground (`fg`). You also learned how to use `nohup` to make a command immune to hangups, so that it can keep running after the terminal closes. This gives you powerful tools for managing multiple tasks within a single shell session.
7. Inter-Process Communication (IPC) Overview
Processes often need to coordinate their actions or exchange data with each other. Inter-Process Communication (IPC) refers to the mechanisms provided by the operating system that allow processes to communicate and synchronize. Linux offers a rich set of IPC mechanisms, ranging from simple to complex, each suited for different needs.
Why IPC?
Processes run in their own protected virtual address spaces. One process cannot directly access the memory of another process for security and stability reasons. IPC mechanisms provide controlled ways to bridge this gap. Common reasons for using IPC include:
- Data Exchange: Sending data from one process to another (e.g., output of one command feeding into another).
- Notification: Informing another process that an event has occurred.
- Synchronization: Coordinating access to shared resources to prevent race conditions or ensure tasks happen in the correct order.
- Resource Sharing: Allowing multiple processes to access a shared resource (like shared memory).
- Client-Server Communication: Enabling request-response interactions (e.g., a web server communicating with worker processes).
Common Linux IPC Mechanisms (Overview)
Here's a brief overview of some key IPC methods available in Linux. Note that each of these could be a topic for a much deeper dive, especially concerning their programming interfaces.
- Pipes (Unnamed Pipes)
  - Concept: A unidirectional channel for communication between related processes (typically parent and child). Data written to one end of the pipe by one process can be read from the other end by the other process.
  - Characteristics:
    - Unidirectional (one-way data flow).
    - Used between related processes (created via the `pipe()` system call before `fork()`).
    - Kernel buffers data; acts like a producer-consumer queue.
    - Implicitly used by shells for pipelines (e.g., `ls -l | grep .txt`).
  - Use Cases: Simple data streaming between parent/child or sibling processes. Shell pipelines.
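The producer-consumer behaviour described above can be seen with a deliberately slow writer. This is a small sketch with made-up message text: the reader processes each line as it arrives and stops when the writer closes its end of the pipe.

```shell
# Writer: two lines, one second apart. Reader: handles lines as they arrive
# and leaves the loop at end-of-file, i.e. once the writer has exited.
result=$(
  { echo "line 1"; sleep 1; echo "line 2"; } |
  while IFS= read -r line; do
    printf 'received: %s\n' "$line"
  done
)
printf '%s\n' "$result"
```

Both sides of the pipeline run at the same time; the kernel's pipe buffer decouples them, blocking the reader when the pipe is empty and the writer when it is full.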
- FIFOs (Named Pipes)
  - Concept: Similar to pipes but exist as special files within the filesystem hierarchy. This allows unrelated processes to communicate through them by opening the FIFO file.
  - Characteristics:
    - Unidirectional (though two can be used for bidirectional communication).
    - Accessed via a filesystem path (created with the `mkfifo` command or the `mkfifo()` library function).
    - Allows communication between any two processes that have permission to access the FIFO file.
    - Data is still buffered by the kernel, not stored on disk.
  - Use Cases: Communication between unrelated processes on the same machine, simple client-server setups where processes know the FIFO path.
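A FIFO can be exercised entirely from the shell. In this sketch the temporary directory and the names `demo_fifo` and `received.txt` are hypothetical; the reader and the writer only share the FIFO's path.

```shell
dir=$(mktemp -d)                                   # hypothetical scratch directory
mkfifo "$dir/demo_fifo"                            # create the named pipe
cat "$dir/demo_fifo" > "$dir/received.txt" &       # reader: blocks until a writer opens the FIFO
reader=$!
echo "hello through the FIFO" > "$dir/demo_fifo"   # writer: blocks until a reader is present
wait "$reader"                                     # reader exits at end-of-file
received=$(cat "$dir/received.txt")
rm -r "$dir"
echo "$received"
```

Opening a FIFO blocks until the other end is opened as well, which is why the reader is started in the background first.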
- Signals
  - Concept: As discussed previously (Section 4), signals are asynchronous notifications sent to processes.
  - Characteristics:
    - Primarily for notification, not bulk data transfer (though some limited data can be sent with real-time signals).
    - Can be sent between any two processes with appropriate permissions (`kill()` system call).
    - Limited number of predefined signals, plus user-defined ones (`SIGUSR1`, `SIGUSR2`).
  - Use Cases: Notifying processes of events (termination request, child exit, timer expiry, configuration reload), basic synchronization.
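As a sketch of signal-based notification, a background subshell can install a handler for `SIGUSR1` with the shell's `trap` built-in and react when the parent signals it. The file name `event.log`, the message text, and the one-second pause used to let the handler get installed are all illustrative simplifications.

```shell
workdir=$(mktemp -d)                 # hypothetical scratch directory
(
  # Child: install the handler, then idle in short sleeps so the
  # trap runs within about a second of the signal arriving.
  trap 'echo "reload requested" > "$workdir/event.log"; exit 0' USR1
  while :; do sleep 1; done
) &
child=$!
sleep 1                              # crude wait for the trap to be installed
kill -USR1 "$child"                  # the notification itself carries no data
wait "$child"
event=$(cat "$workdir/event.log")
rm -r "$workdir"
echo "$event"
```

A real daemon would typically use such a signal to trigger a configuration reload, with the actual configuration data coming from a file rather than from the signal.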
- System V IPC (Older, but still widely used)
  - A suite of IPC mechanisms originating from UNIX System V:
    - Message Queues (`msgget`, `msgsnd`, `msgrcv`, `msgctl`)
      - Allow processes to exchange formatted messages via a kernel-managed queue.
      - Processes identify queues by a unique key or ID.
      - Messages can have types, allowing selective retrieval.
      - Persistent (queues exist until explicitly removed or system reboot).
    - Semaphores (`semget`, `semop`, `semctl`)
      - Used for synchronization. A semaphore is essentially a counter used to control access to shared resources.
      - Processes can perform atomic operations (like decrementing - wait/P, or incrementing - signal/V) on semaphore values.
      - Used to implement mutexes (mutual exclusion locks), enforce resource limits, etc.
      - Persistent.
    - Shared Memory (`shmget`, `shmat`, `shmdt`, `shmctl`)
      - The fastest form of IPC for bulk data transfer.
      - Allows multiple processes to map the same region of physical memory into their virtual address spaces.
      - Processes can read and write to this shared region directly, without kernel mediation for each access.
      - Requires external synchronization (like semaphores) to coordinate access and prevent corruption.
      - Persistent.
  - Administration: Commands like `ipcs` (list IPC objects) and `ipcrm` (remove IPC objects) are used to manage System V IPC resources.
- POSIX IPC (More modern, generally preferred over System V for new development)
  - An alternative, standardized set of IPC APIs:
    - POSIX Message Queues (`mq_open`, `mq_send`, `mq_receive`, `mq_close`, `mq_unlink`)
      - Similar in concept to System V message queues but with a different API based on message queue descriptors (implemented as file descriptors on Linux).
      - Often considered easier to use.
      - Messages can have priorities.
    - POSIX Semaphores (`sem_open`, `sem_wait`, `sem_post`, `sem_close`, `sem_unlink`)
      - Can be named (identified by a name of the form `/name`, so unrelated processes can open them, much like FIFOs) or unnamed (memory-based, for related processes/threads).
      - Simpler API than System V semaphores for basic locking.
    - POSIX Shared Memory (`shm_open`, `mmap`, `munmap`, `shm_unlink`, `ftruncate`)
      - Uses file descriptors to refer to shared memory objects, which on Linux usually appear in a virtual filesystem (`/dev/shm`).
      - Uses the standard `mmap()` call to map the object into the process's address space.
      - Considered more flexible and integrated with filesystem concepts than System V shared memory. Still requires external synchronization.
- Sockets (Network and Unix Domain)
  - Concept: Provide a general-purpose endpoint for communication. Most commonly associated with network communication (using TCP/IP or UDP), but also usable for IPC on a single machine.
  - Characteristics:
    - Network Sockets (AF_INET, AF_INET6): Allow communication between processes on different machines across a network, or on the same machine via the loopback interface (`127.0.0.1`). Uses standard networking protocols.
    - Unix Domain Sockets (AF_UNIX / AF_LOCAL): Allow communication between processes on the same machine using a filesystem path as the socket address. More efficient than network sockets for local IPC as it avoids network stack overhead. Supports stream (like TCP) and datagram (like UDP) communication. File system permissions control access.
  - Use Cases: Network client-server applications, local client-server applications (often via Unix domain sockets for performance), flexible bidirectional communication.
Choosing the right IPC mechanism depends heavily on the specific requirements: are the processes related? Do they run on the same machine? Is performance critical? Is simple notification enough, or is bulk data transfer needed? Do you need synchronization?
Workshop Using a Simple Pipe (|)
The most common and visible form of IPC for shell users is the pipe (`|`), used to connect the standard output of one command to the standard input of another. This workshop demonstrates this fundamental IPC mechanism.
Objective: Understand how the shell pipe (`|`) facilitates communication between two separate processes (both started by the shell) by connecting stdout to stdin.
Tools: Shell (`bash`, `zsh`, etc.), `ls`, `grep`, `wc`.
Steps:
- Command 1: Generate Output (`ls`)
  - Run the `ls` command to list files in a directory, for example, your home directory or `/etc`. Let's use `/etc` as it usually has many files: `ls -l /etc`
  - Observe the output: A multi-line listing of files and directories with details. This output is being written to the standard output stream (stdout) of the `ls` process, which by default is connected to your terminal.
- Command 2: Filter Input (`grep`)
  - Run the `grep` command to search for lines containing a specific pattern. Let's search for files related to `ssh`: `grep ssh`
  - Now, `grep` is waiting for input on its standard input stream (stdin). Type some lines, including one with "ssh", and press `Ctrl+D` to signal end-of-input.
  - `grep` will print the line containing "ssh". This shows `grep` reads from stdin.
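The interactive step above can also be reproduced non-interactively by feeding `grep` its standard input from a here-document. The three input lines are made-up sample text.

```shell
# The here-document plays the role of the lines typed at the keyboard;
# its end marks end-of-input, like pressing Ctrl+D.
matches=$(grep ssh <<'EOF'
cron daemon
openssh-server
network manager
EOF
)
printf '%s\n' "$matches"
```

Only the line containing "ssh" is printed, which confirms that `grep` filters whatever arrives on stdin regardless of where it comes from.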
- Connecting Commands with a Pipe (`|`)
  - Now, let's use the pipe (`|`) operator to connect the stdout of `ls -l /etc` directly to the stdin of `grep ssh`: `ls -l /etc | grep ssh`
  - Explanation:
    - The shell sees the `|`.
    - Crucially, instead of connecting `ls`'s stdout to the terminal and `grep`'s stdin to the keyboard, the shell sets up an unnamed pipe (an IPC mechanism managed by the kernel).
    - It starts the `ls -l /etc` process with its stdout redirected to write into the pipe.
    - It starts the `grep ssh` process with its stdin redirected to read from the pipe.
    - `ls` runs, writing its output into the pipe.
    - `grep` runs concurrently, reading lines from the pipe as they become available. If a line contains "ssh", `grep` writes that line to its stdout (which is connected to your terminal).
    - When `ls` finishes and closes its end of the pipe, `grep` detects end-of-input and terminates.
  - Output: You will only see the lines from the `/etc` directory listing that contain the pattern "ssh".
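Since the contents of `/etc` differ from system to system, a reproducible variant of this step can use a scratch directory. The directory and the four file names below are hypothetical stand-ins for real configuration files.

```shell
demo=$(mktemp -d)                                   # hypothetical stand-in for /etc
touch "$demo/ssh_config" "$demo/sshd_config" "$demo/hosts" "$demo/resolv.conf"
matches=$(ls "$demo" | grep ssh)                    # ls writes into the pipe, grep reads from it
rm -r "$demo"
printf '%s\n' "$matches"
```

Only the two entries whose names contain "ssh" survive the filter; `hosts` and `resolv.conf` are written into the pipe by `ls` but discarded by `grep`.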
- Chaining Multiple Pipes:
  - Pipes can be chained together. Let's count how many configuration files (ending in `.conf`) are in `/etc`: `ls /etc | grep '\.conf$' | wc -l`
  - Explanation:
    - `ls /etc`: Lists the entries in `/etc`. Because stdout is a pipe rather than a terminal, `ls` prints one entry per line instead of columns. Output goes to Pipe 1.
    - `grep '\.conf$'`: Reads from Pipe 1. Filters for lines ending (`$`) with `.conf` (the dot `.` is escaped with `\` to match literally). Output goes to Pipe 2.
    - `wc -l`: Reads from Pipe 2. Counts the number of lines (`-l`). Output goes to `wc`'s stdout (the terminal).
  - Output: A single number representing the count of entries ending in `.conf`.
  - IPC: This involves three processes (`ls`, `grep`, `wc`) running concurrently, connected in a chain by two kernel-managed pipes.
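The three-stage pipeline can be checked against a directory with known contents. The scratch directory and file names are again hypothetical; `conf.d.bak` is included to show that the `$` anchor rejects names that merely contain `.conf`-like text.

```shell
demo=$(mktemp -d)                                    # hypothetical stand-in for /etc
touch "$demo/resolv.conf" "$demo/sysctl.conf" "$demo/hosts" "$demo/conf.d.bak"
count=$(ls "$demo" | grep '\.conf$' | wc -l)         # two pipes connect three processes
rm -r "$demo"
echo "$count"
```

With these four files the count is 2, because only `resolv.conf` and `sysctl.conf` end in `.conf`.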
Summary: This workshop illustrated the use of unnamed pipes (`|`) as a fundamental IPC mechanism in the shell. You saw how the pipe connects the standard output of one process to the standard input of another, allowing data to flow between them without intermediate files. This powerful concept enables the creation of complex command pipelines by composing simple, single-purpose tools. While simple, it clearly demonstrates the core idea of IPC: enabling separate processes to exchange data.
Conclusion
Process management is a cornerstone of the Linux operating system. We've journeyed from the basic definition of a process (a program in execution with its associated resources and context) through its lifecycle and various states (`Running`, `Sleeping`, `Stopped`, `Zombie`). You learned how to use essential tools like `ps`, `top`, and `htop` to view, monitor, and analyze processes, examining their PIDs, PPIDs, states, and resource consumption.
We delved into the fundamental Linux mechanism for creating new tasks: the elegant `fork()` and `exec()` system call combination, understanding how parent-child relationships are formed and how new programs are loaded. We explored the critical role of signals for communication and control, learning how to terminate processes politely (`SIGTERM`) or forcefully (`SIGKILL`) using commands like `kill` and `pkill`, and how to suspend and resume them (`SIGSTOP`, `SIGCONT`).
Furthermore, we touched upon process scheduling, the kernel's mechanism for fairly distributing CPU time, and how user-space tools like `nice` and `renice` allow influencing process priorities. We practiced managing foreground and background jobs using shell job control (`&`, `jobs`, `fg`, `bg`, `Ctrl+Z`) and learned how `nohup` helps processes survive terminal closure.
Finally, we briefly surveyed the landscape of Inter-Process Communication (IPC), recognizing the need for processes to communicate and synchronize, and identifying mechanisms like pipes, FIFOs, signals, System V IPC, POSIX IPC, and sockets as the tools Linux provides to bridge the isolation between processes. The hands-on workshop using shell pipes provided a tangible example of IPC in action.
Mastering process management is essential for anyone serious about using Linux effectively. It empowers you to diagnose performance issues, manage system resources efficiently, control application execution, understand system behavior, and build more complex, coordinated applications. The concepts and tools covered here provide a solid foundation for further exploration into system administration, performance tuning, and systems programming on Linux.