SaLUG! @ Manifatture KNOS
19 February 2015
What problems are we trying to solve?
Reference: http://www.jedi.be/blog/2010/02/12/what-is-this-devops-thing-anyway/
- SaaS (Software as a Service)
- Continuous Deployment (a controlled and mostly automated pipeline which brings your code from development to production systems)
RCS (Revision Control System) is a tool which helps you keep track of ongoing changes and simplifies forks and merges of parallel development branches.
(e.g. git or hg)
TDD (Test Driven Development) is a development workflow where changes to the code are driven by test cases, from a single class/component (unit tests) to a group of components or the entire system (functional and integration tests)
Continuous Building is an automated pipeline which runs your test suites on every change (e.g. a Jenkins master polls your RCS server for changes and runs all your test cases on an appropriate slave system)
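The polling idea above can be sketched in a few lines of shell. This is only an illustration of the principle, not how Jenkins is implemented: the function name build_if_changed and the state file are hypothetical, and the revision and test command are passed in by the caller.

```shell
# build_if_changed STATE_FILE REVISION COMMAND...
# Runs COMMAND only when REVISION differs from the one recorded in
# STATE_FILE, and records REVISION only if COMMAND succeeds.
build_if_changed() {
    state_file=$1; revision=$2; shift 2
    last_built=$(cat "$state_file" 2>/dev/null) || last_built=
    if [ "$revision" = "$last_built" ]; then
        return 0            # nothing new, skip the build
    fi
    "$@" || return 1        # tests failed: leave the state untouched
    printf '%s\n' "$revision" > "$state_file"
}

# Possible use from a polling loop (git and "make test" are assumptions):
#   build_if_changed .last_built "$(git rev-parse HEAD)" make test
```

Recording the revision only after a successful run means a failing revision is retried on the next poll.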
Agile Best Practices applied to SaaS (Software as a Service) use case:
from Continuous Building to Continuous Deployment
"DevOps is a cross-disciplinary community of practice dedicated to the study of building, evolving and operating rapidly-changing resilient systems at scale."
Jez Humble (Chef vice-president)
DevOps is also characterized by operations staff making use of many of the same techniques as developers for their systems work:
- RCS (e.g. git or hg)
- TDD (e.g. unit, functional and integration tests)
- Continuous Building (e.g. jenkins)
DevOps means
extending Agile principles from the code to the entire service.
So... How can we introduce Continuous Deployment?
- repeatability
- testability
How to achieve repeatability and testability?
- Configuration Management
- System Automation
- System Orchestration
- Virtual Machines & Containers
Configuration Management and System Automation help us to:
System Orchestration, which complements Configuration Management and System Automation, helps us to:
Virtual Machines and Containers, which isolate the environment we're building, testing or running, give us:
intro to linux isolation technologies
chroot is an operation that changes the apparent root directory for the current running process and its children
Reference: https://en.wikipedia.org/wiki/Chroot
chroot is, at the same time, the name of the system call and of the executable which wraps it for sysadmins.
A lot of Unix daemons use the chroot system call internally as a Privilege Separation feature.
(e.g. Postfix chroots its helper programs to reduce the security risks)
A lot of build tools use chroot (system call or executable wrapper) as an Isolation feature.
(e.g. debootstrap uses it to create a new Debian/Ubuntu/etc. system in a defined directory)
debootstrap an Ubuntu release in a target dir
$ sudo debootstrap trusty ./trusty-rootfs
I: Retrieving Release
I: Retrieving Release.gpg
I: Checking Release signature
I: Valid Release signature (key id 790BC7277767219C42C86F933B4FE6ACC0B21F32)
I: Retrieving Packages
I: Validating Packages
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Checking component main on http://archive.ubuntu.com/ubuntu...
I: Retrieving adduser 3.113+nmu3ubuntu3
I: Validating adduser 3.113+nmu3ubuntu3
I: Retrieving apt 1.0.1ubuntu2
...
I: Configuring ubuntu-minimal...
I: Configuring libc-bin...
I: Configuring initramfs-tools...
I: Base system installed successfully.
chroot in the rootfs created by debootstrap (1/2)
$ ls -l trusty-rootfs
total 76
drwxr-xr-x 2 root root 4096 feb 13 18:47 bin
drwxr-xr-x 2 root root 4096 apr 11 2014 boot
drwxr-xr-x 3 root root 4096 feb 13 18:46 dev
drwxr-xr-x 61 root root 4096 feb 13 18:47 etc
...
$ sudo chroot trusty-rootfs /bin/bash
root@tardis:/# ls proc/
root@tardis:/# ls sys/
root@tardis:/# mount
warning: failed to read mtab
chroot in the rootfs created by debootstrap (2/2)
root@tardis:/# mount -t proc proc /proc
root@tardis:/# mount
procfs on /proc type proc (rw)
root@tardis:/# ls /proc/
1 1435 16 18635 2297 25 3 37 56 9 loadavg
10 1436 1601 18636 22986 2500 30541 379 5654 922 locks
1021 1448 1646 18637 23 25027 3059 38 566 97 mdstat
...
root@tardis:/# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 38544 7964 ? Ss Feb02 0:05 /sbin/init
root 2 0.0 0.0 0 0 ? S Feb02 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Feb02 0:05 [ksoftirqd/0]
...
Capabilities are a feature of the Linux kernel (modelled after the POSIX capabilities draft spec, POSIX.6/POSIX 1003.1e, which was never finalized and was later dropped) that partitions the root privileges into a set of distinct capabilities.
Reference: https://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.2/capfaq-0.2.txt
Every Linux process has three capability bitmaps:
- effective capabilities (E): the capabilities currently enabled
- permitted capabilities (P): the capabilities the process can use (i.e. those it can enable or disable)
- inheritable capabilities (I): the capabilities the process has and that should be inherited by its child processes
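The three bitmaps can be inspected through /proc. The sketch below assumes a Linux system with /proc mounted; the helper names show_caps and has_cap are made up for this example. CAP_NET_RAW is capability number 13 in linux/capability.h.

```shell
# show_caps [PID]: print the inheritable, permitted and effective
# capability bitmaps of a process (default: the current shell).
show_caps() {
    grep -E '^Cap(Inh|Prm|Eff):' "/proc/${1:-$$}/status"
}

# has_cap HEXMASK BIT: succeed if capability number BIT is set in HEXMASK.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

show_caps

# Is CAP_NET_RAW (bit 13) in the effective set of this shell?
eff=$(awk '/^CapEff:/ {print $2}' "/proc/$$/status")
if has_cap "$eff" 13; then
    echo "CAP_NET_RAW is effective"
else
    echo "CAP_NET_RAW is not effective"
fi
```

If libcap's capsh is installed, capsh --decode=HEXMASK turns a bitmap into capability names.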
ping command (1/3)
$ ls -l `which ping`
-rwsr-xr-x 1 root root 44168 mag 7 2014 /bin/ping
$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=51 time=56.4 ms
...
ping command (2/3)
$ sudo cp `which ping` ./ping
$ ls -l ./ping
-rwxr-xr-x 1 root root 44168 mag 7 2014 ./ping
$ ./ping 8.8.8.8
ping: icmp open socket: Operation not permitted
ping command (3/3)
$ sudo setcap CAP_NET_RAW+p ./ping
$ ls -l ./ping
-rwxr-xr-x 1 root root 44168 mag 7 2014 ./ping
$ getcap ./ping
./ping = cap_net_raw+p
$ ./ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=51 time=57.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=51 time=56.3 ms
...
Namespaces are a feature of the Linux kernel useful in creating processes more isolated from the system they live in, without needing a full virtualization technology.
Reference: https://www.howtoforge.com/linux-namespaces
Starting from Linux v2.6.24, the clone system call supports 6 different types of namespace flags.
Reference: http://manpages.ubuntu.com/manpages/trusty/man2/clone.2.html
- CLONE_NEWIPC: isolated SystemV IPC and POSIX Message Queues
- CLONE_NEWPID: isolated PIDs
- CLONE_NEWNET: isolated networking (/proc/net, interfaces, routes)
- CLONE_NEWNS: isolated mount points (like a security-improved chroot)
- CLONE_NEWUTS: isolated hostname
- CLONE_NEWUSER: isolated user and group IDs (recently added)
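Each namespace a process belongs to is exposed as a symbolic link under /proc/PID/ns, and two processes share a namespace exactly when the corresponding links point to the same target. A minimal sketch, assuming a kernel recent enough to provide all six entries (the helper name ns_id is made up for this example):

```shell
# ns_id TYPE [PID]: print the namespace identifier of a process
# (default: the current shell), in the form "TYPE:[inode]".
ns_id() {
    readlink "/proc/${2:-$$}/ns/$1"
}

# One line per namespace type listed above.
for t in ipc mnt net pid uts user; do
    printf '%-5s %s\n' "$t" "$(ns_id "$t")"
done
```

Comparing the output of ns_id net for two PIDs is a quick way to check whether they see the same network stack.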
ip netns sub-command (1/7)
$ sudo ip netns add test01
$ sudo ip netns add test02
$ sudo ip netns list
test02
test01
ip netns sub-command (2/7)
$ sudo ip netns exec test01 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
link/ether c8:f7:33:da:68:0d brd ff:ff:ff:ff:ff:ff
...
ip netns sub-command (3/7)
$ sudo ip link add vethtest01 type veth peer name vethtest02
$ sudo ip link show vethtest01
66: vethtest01: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether e2:0b:93:90:85:76 brd ff:ff:ff:ff:ff:ff
$ sudo ip link show vethtest02
65: vethtest02: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether da:15:be:2f:06:10 brd ff:ff:ff:ff:ff:ff
ip netns sub-command (4/7)
$ sudo ip link set vethtest01 netns test01
$ sudo ip link set vethtest02 netns test02
$ sudo ip netns exec test01 ip l
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
66: vethtest01: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether e2:0b:93:90:85:76 brd ff:ff:ff:ff:ff:ff
$ sudo ip netns exec test01 ip link set dev vethtest01 name eth0
$ sudo ip netns exec test02 ip link set dev vethtest02 name eth0
$ sudo ip netns exec test01 ip l
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
66: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether e2:0b:93:90:85:76 brd ff:ff:ff:ff:ff:ff
ip netns sub-command (5/7)
$ sudo ip netns exec test01 ip link set eth0 up
$ sudo ip netns exec test01 ip addr add 192.168.100.1/24 dev eth0
$ sudo ip netns exec test02 ip link set eth0 up
$ sudo ip netns exec test02 ip addr add 192.168.100.2/24 dev eth0
ip netns sub-command (6/7)
In two separate terminals:
$ sudo ip netns exec test01 nc -l -p 8080
echo1
echo2
$ sudo ip netns exec test01 netstat -tuapln
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 21835/nc
$ sudo ip netns exec test02 telnet 192.168.100.1 8080
echo1
echo2
ip netns sub-command (7/7)
$ sudo ip netns del test01
$ sudo ip netns del test02
limits is a feature of the Linux kernel that can enforce "limits" on the resources a process/user can consume
Reference: https://wiki.debian.org/Limits
pam_limits is a PAM module which enforces limits on every opened session, as configured in /etc/security/limits.conf:
$ less /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
* soft core 0
root hard core 100000
* hard rss 10000
@student hard nproc 20
@faculty soft nproc 20
@faculty hard nproc 50
ftp hard nproc 0
ftp - chroot /ftp
@student - maxlogins 4
Every user can change their own soft limit to any value between zero and the hard limit
## Show the current Hard limit for "memlock"
$ ulimit -H -l
64
## Show the current Soft limit for "memlock"
$ ulimit -S -l
64
## Show all the hard limits
$ ulimit -H -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
...
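The soft/hard rule above can be tried without root by lowering a soft limit for a single command. A minimal sketch, assuming a shell whose ulimit builtin supports -n (bash and dash do) and a hard limit on open files of at least 64; the helper name with_soft_nofile is made up for this example.

```shell
# with_soft_nofile LIMIT COMMAND...: run COMMAND with a lowered soft
# limit on open files. The function body is a subshell, so the change
# never reaches the caller.
with_soft_nofile() (
    limit=$1; shift
    ulimit -S -n "$limit" || exit 1
    "$@"
)

with_soft_nofile 64 sh -c 'ulimit -S -n'   # the child reports its soft limit
ulimit -S -n                               # the caller's limit is unchanged
```

Because only the soft limit is touched, the child could still raise it again up to the hard limit.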
cgroups (control groups) is a feature of the linux kernel that limits, accounts for and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.
Reference: https://en.wikipedia.org/wiki/Cgroups
Cgroups provides:
- Resource limitation: groups can be set to not exceed a configured memory limit, which also includes the file system cache
- Prioritization: some groups may get a larger share of CPU utilization or disk I/O throughput
- Accounting: measures how many resources certain systems use, which may be used, for example, for billing purposes
- Control: freezing groups of processes, their checkpointing and restarting
Cgroups available subsystems (resource controllers):
- blkio subsystem sets limits on input/output access to and from block devices such as physical drives (disk, solid state, USB, etc.).
- cpu subsystem uses the scheduler to provide cgroup tasks access to the CPU.
- cpuacct subsystem generates automatic reports on CPU resources used by tasks in a cgroup.
- cpuset subsystem assigns individual CPUs (on a multicore system) and memory nodes to tasks in a cgroup.
- devices subsystem allows or denies access to devices by tasks in a cgroup.
- freezer subsystem suspends or resumes tasks in a cgroup.
- memory subsystem sets limits on memory use by tasks in a cgroup, and generates automatic reports on memory resources used by those tasks.
- net_cls subsystem tags network packets with a class identifier (classid) that allows the Linux traffic controller (tc) to identify packets originating from a particular cgroup task.
- net_prio subsystem provides a way to dynamically set the priority of network traffic per network interface.
- ns subsystem integrates the Linux kernel namespace feature.
$ cat /proc/self/cgroup
11:hugetlb:/user/1000.user/c2.session
10:perf_event:/user/1000.user/c2.session
9:blkio:/user/1000.user/c2.session
8:freezer:/user/1000.user/c2.session
7:devices:/user/1000.user/c2.session
6:memory:/user/1000.user/c2.session
5:cpuacct:/user/1000.user/c2.session
4:cpu:/user/1000.user/c2.session
3:name=systemd:/user/1000.user/c2.session
2:cpuset:/user/1000.user/c2.session
$ ls /sys/fs/cgroup/cpuset/user/1000.user/c2.session
cgroup.clone_children cpuset.mem_hardwall cpuset.sched_load_balance
cgroup.event_control cpuset.memory_migrate cpuset.sched_relax_domain_level
cgroup.procs cpuset.memory_pressure notify_on_release
cpuset.cpu_exclusive cpuset.memory_spread_page tasks
cpuset.cpus cpuset.memory_spread_slab
cpuset.mem_exclusive cpuset.mems
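The hierarchy-ID:controller-list:path format shown by /proc/self/cgroup is easy to parse in scripts. The helper below is a sketch (the name cgroup_of is made up for this example); it reads that format from standard input and prints the cgroup path for one controller.

```shell
# cgroup_of CONTROLLER: read /proc/PID/cgroup-style lines on stdin and
# print the cgroup path of the hierarchy that CONTROLLER is attached to.
cgroup_of() {
    awk -F: -v want="$1" '{
        # The second field may list several controllers, e.g. "cpu,cpuacct".
        n = split($2, names, ",")
        for (i = 1; i <= n; i++) {
            if (names[i] == want) {
                path = $0
                sub(/^[^:]*:[^:]*:/, "", path)   # keep everything after the 2nd colon
                print path
                exit
            }
        }
    }'
}

cgroup_of cpuset < /proc/self/cgroup
```

On a system that only uses the unified (v2) hierarchy the controller list is empty, so this helper prints nothing there.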
$ sudo mount -t tmpfs cgroup_root /sys/fs/cgroup
$ sudo mkdir /sys/fs/cgroup/cpuset
$ sudo mount -t cgroup cpuset -ocpuset /sys/fs/cgroup/cpuset
$ cd /sys/fs/cgroup/cpuset
$ sudo mkdir Charlie
$ cd Charlie
$ sudo sh -c '/bin/echo 2-3 > cpuset.cpus'
$ sudo sh -c '/bin/echo 1 > cpuset.mems'
$ sudo sh -c "/bin/echo $$ > tasks"
$ sh
# The subshell 'sh' is now running in cgroup Charlie
# The next line should display '/Charlie'
$ cat /proc/self/cgroup
Reference: https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
The current default sandbox setup of xdg-app is:
- All processes run as the user with no capabilities
- A filesystem namespace where:
- / is a private tmpfs not visible anywhere else. The sandbox uses pivot_root to make it the new /, and all other mounts from the host are unmounted from the namespace.
- /usr is a bind mount of the runtime
- /self is a bind mount of the application
- /var is a bind mount of the per-application, per-user writable data store
- /proc shows only the processes in the app sandbox
- ...
Reference: http://blogs.gnome.org/alexl/2015/02/17/first-fully-sandboxed-linux-desktop-app/