mkrepo: automating rpm and deb package lifecycle on s3

Post on 16-Jan-2017



⬢ mkrepo

Automating package lifecycle on s3

by 🐦racktear

State of packages

DockerHub repo popularity

ubuntu - 4.8k

nginx - 4.3k 

mysql - 3.2k 

node - 2.9k 

redis - 2.8k 

centos - 2.7k 

Debian derivatives

What about Alpine Linux?

$ cat control.tar.gz \
      data.tar.gz > \
      mypackage-1.0-r0.apk
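The command above relies on gzip streams being concatenable. A minimal Python sketch of the same idea follows; the byte strings are stand-ins for real tar archives, and the names `control_gz`, `data_gz` and `apk_like` are illustrative assumptions, not part of any Alpine tooling.

```python
import gzip

# Stand-ins for control.tar.gz and data.tar.gz; real segments would be tar archives.
control_gz = gzip.compress(b"control-segment\n")
data_gz = gzip.compress(b"data-segment\n")

# Equivalent of `cat control.tar.gz data.tar.gz > mypackage-1.0-r0.apk`.
apk_like = control_gz + data_gz

# A multi-member gzip file decompresses to the concatenation of its members.
payload = gzip.decompress(apk_like)
print(payload)
```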

Vintage package formats are here to stay

What's the deal with packages?

Ad-hoc builds on a VPS sort of work

Publishing from jenkins is doable, but clunky

Concourse.ci is stateless ⇒ forget about it

Amazon S3 is not usable without repo duplication

An ideal repo tool

• Experiment with packages on your laptop

• Run self-hosted repo in one command

• Generate metadata on remote machines/services

• Embed well into CD pipeline

• Run as a service

Package repositories

Functions

Mapping package names to URIs

Recording package checksums

Proving authorship through signatures

Typical repository

foo-1.0.bin

foo-2.1.bin

bar-1.5.bin

Metadata

foo 1.0, sha256=... uri=...

foo 2.1, sha256=... uri=...

bar 1.5, sha256=... uri=...

GPG signature

Metadata.asc
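As a rough sketch of the first two functions (mapping names to URIs and recording checksums), the metadata lines above could be generated like this. The helper `build_metadata` and the sample contents are hypothetical; the signature step (Metadata.asc) is left out because it needs a GPG tool outside the standard library.

```python
import hashlib

def build_metadata(packages):
    """Return one metadata line per package.

    `packages` maps (name, version) to (uri, content bytes).
    """
    lines = []
    for (name, version), (uri, content) in sorted(packages.items()):
        digest = hashlib.sha256(content).hexdigest()
        lines.append(f"{name} {version}, sha256={digest} uri={uri}")
    return lines

# Made-up package contents, only to have something to hash.
packages = {
    ("foo", "1.0"): ("foo-1.0.bin", b"foo one"),
    ("foo", "2.1"): ("foo-2.1.bin", b"foo two"),
    ("bar", "1.5"): ("bar-1.5.bin", b"bar"),
}
metadata = build_metadata(packages)
print("\n".join(metadata))
```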

Every stack reinvents repository formats :(

rpm repo structure


Top-level xml metadata file

Separate xml file list

Separate xml package mapping

rpm: Binary package files in own format

rpm repo structure

/
  repodata/
    repomd.xml
    filelists.xml.gz
    primary.xml.gz
    other.xml.gz
  Packages/
    foo-1.2.rpm
    bar-2.0.rpm
    ...
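To illustrate how the top-level file points at the other metadata files, here is a small sketch that reads a hand-written, heavily trimmed repomd.xml. The sample document and the helper `list_metadata_files` are assumptions for illustration; real files carry more fields per entry (checksums, sizes).

```python
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml documents.
NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Hand-written sample, not taken from a real repository.
REPOMD = """\
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
    <timestamp>1474223050</timestamp>
  </data>
  <data type="filelists">
    <location href="repodata/filelists.xml.gz"/>
    <timestamp>1474223050</timestamp>
  </data>
</repomd>
"""

def list_metadata_files(repomd_text):
    """Map each metadata type to (relative location, timestamp)."""
    root = ET.fromstring(repomd_text)
    result = {}
    for data in root.findall("repo:data", NS):
        href = data.find("repo:location", NS).get("href")
        timestamp = int(float(data.find("repo:timestamp", NS).text))
        result[data.get("type")] = (href, timestamp)
    return result

files = list_metadata_files(REPOMD)
print(files)
```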

rpm file

deb repo structure


Top-level text metadata file

Per-architecture text files with package lists

(optional) Separate text file with file lists

deb: ar package with metadata and content

deb repo structure

/
  dists/
    dist1/
      Release
      component/arch/
        Packages
    dist2/
  pool/
    f/foo-1.2-dist1.deb
    b/bar-2.0-dist2.deb
    ...

deb file structure

control file


control
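The control file is plain text made of `Field: value` lines, with indented lines continuing the previous field. A minimal parser sketch, assuming a single stanza; the function name `parse_control` and the sample package are hypothetical.

```python
def parse_control(text):
    """Parse one control stanza into a dict of field name -> value."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines; this sketch handles a single stanza
        if line[0] in " \t" and current is not None:
            # Continuation line: append to the previous field's value.
            fields[current] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields

# Made-up example stanza.
CONTROL = """\
Package: foo
Version: 1.2
Architecture: amd64
Description: example package
 A longer description that continues
 on indented lines.
"""

control = parse_control(CONTROL)
print(control["Package"], control["Version"])
```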

Parsing package metadata

Binary

+ struct module

struct module

ver, reserved, num_index_entries, num_data_bytes = \
    struct.unpack('>BIII', fd.read(13))

struct module

This way you can unpack almost any binary

I even did it once for the RIP protocol

Also, you can experiment interactively
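For interactive experiments, the same unpack call can be tried on synthetic bytes instead of a real package file. In this sketch the header values are made up, and `HEADER_FORMAT` is just a name for the `'>BIII'` format string used above.

```python
import io
import struct

# Big-endian: one unsigned byte followed by three 4-byte unsigned ints.
HEADER_FORMAT = ">BIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 13 bytes, no padding with '>'

# Synthetic header bytes standing in for data read from a package file.
raw = struct.pack(HEADER_FORMAT, 1, 0, 7, 256)
fd = io.BytesIO(raw)

ver, reserved, num_index_entries, num_data_bytes = \
    struct.unpack(HEADER_FORMAT, fd.read(HEADER_SIZE))
print(ver, reserved, num_index_entries, num_data_bytes)
```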

Generating repository

Generating rpm repository

Repodata has timestamps

Files have timestamps

No need to re-download old files
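One way to picture the timestamp comparison: given the timestamps already recorded in the index and the timestamps reported by storage, only new or changed files need to be fetched. This is a sketch under that assumption, not necessarily how mkrepo implements it; `plan_update` and the sample timestamps are hypothetical.

```python
def plan_update(indexed, stored):
    """Compare the existing index with the current storage listing.

    Both arguments map file name -> modification timestamp.
    Returns (files to download and index, files to drop from the index).
    """
    to_index = sorted(
        name for name, mtime in stored.items()
        if name not in indexed or indexed[name] < mtime
    )
    to_drop = sorted(name for name in indexed if name not in stored)
    return to_index, to_drop

# Made-up timestamps: bar was re-uploaded, baz is new, old was deleted.
indexed = {"foo-1.2.rpm": 100, "bar-2.0.rpm": 100, "old-0.1.rpm": 50}
stored = {"foo-1.2.rpm": 100, "bar-2.0.rpm": 150, "baz-3.0.rpm": 160}

to_index, to_drop = plan_update(indexed, stored)
print(to_index, to_drop)
```

Here foo-1.2.rpm is untouched, so it is never downloaded again.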

Generating deb repository

Metadata has no timestamps

Files have timestamps

Not clear how to calculate diff

Generating deb repository

Fortunately, the package index documentation states:

"Note that the control file of .deb files may contain additional fields not yet documented by policy or not yet documented here which then might also be found in this file."

Generating deb repository

...
Size: 3906802
FileTime: 1474223050.0 # <- custom field for timestamp
MD5Sum: 92c2a4
SHA1: 257d6e...
SHA256: c019a1...
...

"Packages" metadata file
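A sketch of how such an entry could be produced, with the extra FileTime field written next to the size and checksums. The helper `package_stanza` and its arguments are hypothetical, and the field spellings simply follow the slide.

```python
import hashlib

def package_stanza(control_fields, filename, content, mtime):
    """Build one "Packages" entry, including the custom FileTime field."""
    fields = dict(control_fields)
    fields["Filename"] = filename
    fields["Size"] = str(len(content))
    fields["FileTime"] = repr(float(mtime))  # custom field for timestamp
    fields["MD5Sum"] = hashlib.md5(content).hexdigest()
    fields["SHA1"] = hashlib.sha1(content).hexdigest()
    fields["SHA256"] = hashlib.sha256(content).hexdigest()
    return "".join(f"{key}: {value}\n" for key, value in fields.items())

# Made-up package content and control fields.
stanza = package_stanza(
    {"Package": "foo", "Version": "1.2"},
    "pool/f/foo-1.2-dist1.deb",
    b"payload",
    1474223050,
)
print(stanza)
```

With FileTime stored in the index, the stored timestamp can be compared with the file's timestamp in storage, as in the rpm case.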

Adapting to modern CD

Why store remotely?

Object storage is a good candidate for packages

Specialized storage has better guarantees

SOA

Travis + s3

Concourse + Minio

Our concourse pipeline

build.tarantool.org

Ideal pipeline

Future repository format

• Git-like, with history

• Support for merges or partial updates

• Fast and lightweight

• Transactional

• Extensible through custom metadata

Thanks!

Konstantin Nazarov

🐦racktear

mail@kn.am