Dovecot Mail Storage

Timo Sirainen

Me: Timo Sirainen

• Born 1979 in Finland
• First C64 BASIC programs around 1988
• Open source coding since about 1998
  – Irssi IRC client 1999-2004, still widely used
• Worked as programmer since 1999
• Went to university in 2006
• Dovecot project started in 2002
  – Working full time on it since about 2007
  – 2009: Rackspace, USA
  – 2010: SAPO, Portugal

Dovecot

• Open source IMAP/POP3 server
  – Only mail retrieval to clients, no mail sending
• First version released in 2002
• Mostly written by me
  – Except Sieve by Stephan Bosch
• High performance is an important goal
  – Disk I/O is typical bottleneck -> everything optimized to reduce it

Talk Overview

• Traditional mailbox formats
• Dovecot indexes
• Dovecot mailbox formats
• Full text search indexes
• Future ideas

mbox

• One file per mailbox
• Metadata in headers that are filtered out
  – X-UID, Status, X-Status, X-Keywords, etc.
• Deleting requires moving data around
  – Fragile: corruption if crashes in the middle
  – Slow when deleting old messages
• May become fragmented with constant appends
  – But non-fragmented file is fast to read

Maildir

• One file per message
  – Reading through all files can be slow
• Message flags in filename (name:2,<flags>)
  – Lots of renaming
  – Finding the current filename can be difficult
• Maildir is lockless? Not so much, Dovecot uses write/sync lock
  – Otherwise files can temporarily be lost during renames
    • Was the file really deleted or just renamed?
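
As a rough illustration of the "name:2,<flags>" convention, the Python sketch below splits a Maildir filename into its base name and flag letters, and builds the filename a flag change would require. The helper names are hypothetical and this is not Dovecot's code; the flag letters (D, F, P, R, S, T) are the standard Maildir ones.

```python
# Sketch only: hypothetical helpers, not Dovecot's implementation.
# Standard Maildir flag letters: D(raft), F(lagged), P(assed),
# R(eplied), S(een), T(rashed).

def parse_maildir_filename(filename: str):
    """Split 'name:2,<flags>' into (name, set of flag letters)."""
    base, sep, info = filename.partition(":2,")
    if not sep:
        # No ":2," suffix: the message has no flags recorded yet.
        return filename, set()
    return base, set(info)


def with_flags(filename: str, flags) -> str:
    """Return the filename the message must be renamed to for these flags."""
    base, _ = parse_maildir_filename(filename)
    # Flag letters are stored in ASCII order.
    return f"{base}:2,{''.join(sorted(flags))}"
```

Every flag change is therefore a rename of the message file, which is why a reader holding the old name may fail to open the file even though the message still exists.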

Dovecot Index Files

• Main index
  – List of messages
  – Message flags
  – Offsets to cache records
• Cache file
  – Message size, some headers, etc.
  – Keep only data that client actually uses
    • Different clients want different data for different amount of time

Dovecot Main Index

• In two files:
  – dovecot.index: Somewhat recent snapshot
  – dovecot.index.log: Recent changes
• All changes go through the log
• Readers read snapshot to memory and apply latest changes from log
  – Once opened, only need to read log updates
    • Very efficient with remote filesystems (NFS, cluster FSes)!
• Snapshot is updated “once in a while”
  – Tries to minimize disk I/O
  – Writes are usually more expensive than reads
• Log also useful for finding “what changed” events for IMAP clients
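
The snapshot-plus-log idea can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not the real dovecot.index or dovecot.index.log formats: the `IndexView` class, the `(op, uid, flags)` log tuples and the operation names are all hypothetical.

```python
# Toy model: a reader loads a snapshot once, then only replays new log entries.
# The record shapes and names are assumptions made for illustration.

class IndexView:
    def __init__(self, snapshot, snapshot_log_pos):
        # snapshot: {uid: set of flags}, valid as of log position snapshot_log_pos
        self.flags = {uid: set(f) for uid, f in snapshot.items()}
        self.log_pos = snapshot_log_pos

    def refresh(self, log):
        """Apply log entries not seen yet and return them ("what changed")."""
        changes = log[self.log_pos:]
        for op, uid, flags in changes:
            if op in ("append", "flags"):
                self.flags[uid] = set(flags)
            elif op == "expunge":
                self.flags.pop(uid, None)
        self.log_pos = len(log)
        return changes
```

After the initial load, each `refresh()` touches only the tail of the log, and the returned list is exactly the set of changes an IMAP session would need to report to its client.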

Dovecot Cache

• The main reason for Dovecot’s good performance
• Different IMAP clients want different data
  – Caching data that client doesn’t use wastes disk space and disk I/O
• Flexible format, allows adding any number of fields
  – Per-field caching decisions: “no”, “temporary”, “permanent”
• Cached fields never change (IMAP guarantees)
  – Data is added without locking -> duplicate data is possible
• Once in a while the file is recreated -> deleted and unwanted records are dropped

Locking

• Lock waits are bad
  – Higher user visible latency
  – Timeout failures during high load
• Dovecot v0.99 used traditional read/write index locks
  – Locking timeout problems
  – Redesigned v1.0 to do lockless reads

Lockless reads: rename()

• For:
  – Small files
  – Rarely changing files
  – If a large part of the file changes
• Writer
  – Lock
  – If file has changed, read+update internal state
  – Write the updated data to temp file
  – rename() over the original file
  – Unlock
• Reader
  – Just read the file.

[Figure: labels “#1”, “#2”, “Temp file”, “rename()”]
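
The writer and reader steps can be sketched in Python as follows. This is a minimal sketch under assumptions, not Dovecot's code: the separate `.lock` file, the use of `fcntl.flock()` (Unix only) and the `update` callback are choices made for the example.

```python
import fcntl
import os
import tempfile


def update_file(path, update):
    """Writer: lock, re-read, write a temp file, rename() over the original."""
    # Assumption: writers serialize on a separate "<path>.lock" file.
    lock_fd = os.open(path + ".lock", os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(lock_fd, fcntl.LOCK_EX)           # Lock
        try:
            with open(path, "rb") as f:               # re-read current state
                current = f.read()
        except FileNotFoundError:
            current = b""
        new_data = update(current)
        # The temp file must live in the same directory so that the
        # rename stays within one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "wb") as f:
            f.write(new_data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)                    # rename() over the original
    finally:
        os.close(lock_fd)                             # closing releases the lock


def read_file(path):
    """Reader: no lock; it sees either the old or the new file, never a mix."""
    with open(path, "rb") as f:
        return f.read()
```

Because the whole file is rewritten on every change, this approach suits the cases listed on the slide: small or rarely changing files, or updates that touch most of the file anyway.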

Lockless reads: Appends

• For append-only files with “size” header in each written record
• Writer
  – Lock
  – Write data with size=0
  – Write size with each byte’s highest bit set to 1
  – Unlock
• Reader
  – Read one record at a time
  – Stop when seeing a size that isn’t fully written

[Figure: a record consisting of a Size field and Data]

Bits    Content
0-6     Bits 0-6 of size
7       Always 1
8-14    Bits 7-13 of size
15      Always 1
etc.
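
The size encoding in the table can be sketched as below. Two details are assumptions that the slide does not state: a 4-byte size field, and storing the lowest 7 bits in the first byte. An in-memory `bytearray` stands in for the append-only file, and locking is left out.

```python
SIZE_BYTES = 4  # assumption: width of the size field


def encode_size(size: int) -> bytes:
    """Store 7 bits of the size per byte, with the highest bit always 1."""
    if not 0 <= size < 1 << (7 * SIZE_BYTES):
        raise ValueError("size does not fit in the size field")
    return bytes(0x80 | ((size >> (7 * i)) & 0x7F) for i in range(SIZE_BYTES))


def decode_size(raw: bytes):
    """Return the size, or None if the field is not fully written yet."""
    if len(raw) < SIZE_BYTES or any((b & 0x80) == 0 for b in raw[:SIZE_BYTES]):
        return None
    return sum((b & 0x7F) << (7 * i) for i, b in enumerate(raw[:SIZE_BYTES]))


def append_record(buf: bytearray, data: bytes) -> None:
    start = len(buf)
    buf += bytes(SIZE_BYTES) + data                          # data with size=0
    buf[start:start + SIZE_BYTES] = encode_size(len(data))   # then the real size


def read_records(buf) -> list:
    records, pos = [], 0
    while True:
        size = decode_size(bytes(buf[pos:pos + SIZE_BYTES]))
        if size is None:        # end of file or half-written record: stop here
            return records
        records.append(bytes(buf[pos + SIZE_BYTES:pos + SIZE_BYTES + size]))
        pos += SIZE_BYTES + size
```

Since every byte of a finished size field has its highest bit set, a reader can tell a completed field from the zero placeholder (or from a partially written field) without taking a lock.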

Lockless writes in future?

• open(path, O_APPEND) usually provides atomic writes
  – Except with NFS
  – write() may also return fewer bytes than intended? (signal, out of space)
  – read() during a write may see incomplete data?
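
For reference, an O_APPEND write looks like this in Python. The sketch only shows the mechanism and how a short write can be detected; it does not settle the open questions on the slide, and the function name is hypothetical.

```python
import os


def append_record(path, record: bytes) -> bool:
    """Append with O_APPEND; return False if the write came up short."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        # With O_APPEND the kernel positions each write at the current end
        # of file, so concurrent local appenders do not overwrite each other.
        written = os.write(fd, record)
    finally:
        os.close(fd)
    # A short write would leave a partial record that readers must tolerate.
    return written == len(record)
```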

Single-dbox

• One file per message (u.<IMAP UID>)
• Files have immutable metadata section
  – GUID, POP3 UIDL, received date, etc.
• Advantages over Maildir:
  – Filenames don’t change
  – No IMAP UID <-> filename mapping required
• Flags stored only in Dovecot index files
  – Automatically creates dovecot.index.backup once in a while
  – When fixing corruption, tries very hard to preserve flags based on (corrupted) index and backup files

Multi-dbox

• Multiple messages in a single file (m.<id>)
  – File format same as with single-dbox
• Multiple files in a single mailbox
  – Files are about 2 MB (configurable)
    • Larger files -> less fragmentation, but deletion slower
    • Preallocation
  – Can be rotated every n days (for incremental backups)
  – Delayed (ioniced) nightly deletions (“doveadm purge”)
• Crash or power loss can’t corrupt or lose data
• Tries very hard to preserve as much data as possible in case of (filesystem) corruption.
  – Saves a backup of the original broken file

Benchmarks

• Realistic IMAP benchmarks are difficult to do
• Depends on clients and user behavior

Benchmarks

• Reading 10k messages via IMAP

SSD, OSX, HFS+     Uncached   Cached
mbox               2.9 s      1.6 s
Maildir            3.9 s      0.6 s
Single-dbox        3.9 s      0.6 s
Multi-dbox         1.5 s      0.4 s

HDD, Linux, ext4   Uncached   Cached
mbox               2.8 s      2.3 s
Maildir            8.0 s      0.9 s
Single-dbox        6.8 s      0.9 s
Multi-dbox         1.6 s      0.7 s

Benchmarks: # NFS ops

• Reading 10k messages via IMAP
• Above: uncached, below: cached

[Figure: bar chart of the number of NFS operations (axis 0 to 35000) for mdbox, sdbox, Maildir and mbox, split into Reads, Lookup, Access and Getattr]

Benchmarks: # NFS ops

Random IMAP commands sent with:

imaptest logout=5 msgs=1000 delete=10 expunge=10 secs=60 seed=1

L+A+G = lookup + access + getattr

[Figure: NFS operations for mbox, Maildir, sdbox and mdbox, split into Read, Write, Readdir, L+A+G and Other]

New dbox-only Features

Alternative Mail Storage

• Users rarely access their old mails
• Lower performance storage is cheaper -> Move old mails there
• dbox supports “alternative path” setting: If u.* or m.* file isn’t found from primary path, it’s looked up from alternative path
  – Files could even be moved with /bin/mv
    • But easier/safer with “doveadm altmove”
  – This would be difficult with Maildir because its filenames change
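
The lookup order can be sketched as follows. The function and parameter names are hypothetical; the sketch only shows the primary-then-alternative fallback and relies on dbox filenames never changing.

```python
import os


def find_dbox_file(name, primary_dir, alt_dir=None):
    """Look for a u.* or m.* file in the primary path, then the alternative path."""
    for directory in (primary_dir, alt_dir):
        if directory is None:
            continue
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None
```

Since the name stays the same wherever the file lives, moving a file between the two directories needs no index update, which is what makes a plain /bin/mv workable.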

Detached Mail Attachments

• MIME parts can be saved to external files
  – Only if they’re large enough (default: 128 kB)
  – Also can be filtered based on Content-Type, etc. headers
    • Avoid extra disk seek for downloading attachments that clients automatically display inline
• Supports saving base64 encoded MIME parts decoded (25% less disk space)
  – Only if re-encoding can be done to 100% original
• dbox-only
  – Metadata contains pointers to external parts
• Saving is done via simplified “filesystem API”
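
The "only if re-encoding can be done to 100% original" rule can be illustrated with a round-trip check. This is a sketch under assumptions, not Dovecot's algorithm: it guesses the line length from the first line and the newline style from the input, and it refuses to decode anything it cannot rebuild byte for byte. The saving comes from base64 turning every 3 bytes into 4 characters.

```python
import base64


def decode_if_reversible(encoded: bytes):
    """Return the decoded bytes, or None if re-encoding would not be identical."""
    newline = b"\r\n" if b"\r\n" in encoded else b"\n"
    body = encoded[:-len(newline)] if encoded.endswith(newline) else encoded
    trailing = encoded[len(body):]
    line_len = len(body.split(newline, 1)[0])   # assume all lines share this length
    if line_len == 0:
        return None
    try:
        raw = base64.b64decode(body)
    except ValueError:
        return None
    # Re-encode with the detected layout and compare with the original.
    b64 = base64.b64encode(raw)
    lines = [b64[i:i + line_len] for i in range(0, len(b64), line_len)]
    rebuilt = newline.join(lines) + trailing
    return raw if rebuilt == encoded else None
```

A part with uneven line lengths decodes without error but fails the comparison, so it would be stored in its original encoded form.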

Single Instance Storage

• Storage’s internal deduplication
  – Could be enabled only for attachment storage
• Dovecot’s SIS
  – FS API backend
  – Based on file hashes and hard links
    • Hash is configurable (e.g. SHA256 + size)
  – Byte-by-byte verification after hash found
    a) Never, trust hash uniqueness (not implemented)
    b) Immediate comparison during saving
    c) Delayed (nightly) comparison and deduplication

Dovecot SIS

• Attachments saved to “HA/SH/HASH-GUID” under global attachment dir (e.g. /var/attachments/)
  – GUID guarantees filename uniqueness
  – e.g. file with hash “123456” is saved to 12/34/123456-GUID
  – “HA” and “SH” may be symlinks to other mounts
• SIS is done by hard linking HA/SH/hashes/HASH to HA/SH/HASH-GUID if it exists.
  – Basically: “ln hashes/123456 123456-guid”
  – No attempts to create cross-mount hard links
• Safe to move/backup/restore attachment files
  – But hashes/HASH is auto-deleted only when its link count drops from 2 to 1. External changes may leak it.
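
The directory layout and the hard-link step can be sketched as below. The path scheme follows the slide; everything else is an assumption for illustration: the function names, SHA-256 as the hash, the immediate byte-by-byte comparison, and creating hashes/HASH as a link to the first saved copy.

```python
import hashlib
import os


def attachment_paths(root, hash_hex, guid):
    """Return (HA/SH/HASH-GUID, HA/SH/hashes/HASH) under the attachment dir."""
    subdir = os.path.join(root, hash_hex[:2], hash_hex[2:4])
    return (os.path.join(subdir, f"{hash_hex}-{guid}"),
            os.path.join(subdir, "hashes", hash_hex))


def save_attachment(root, data: bytes, guid: str) -> str:
    digest = hashlib.sha256(data).hexdigest()
    target, hash_path = attachment_paths(root, digest, guid)
    os.makedirs(os.path.dirname(hash_path), exist_ok=True)
    if os.path.exists(hash_path):
        with open(hash_path, "rb") as f:
            same = f.read() == data        # immediate byte-by-byte comparison
        if same:
            os.link(hash_path, target)     # "ln hashes/HASH HASH-GUID"
            return target
    with open(target, "wb") as f:          # first copy (or a hash collision)
        f.write(data)
    if not os.path.exists(hash_path):
        os.link(target, hash_path)         # assumption: first copy seeds hashes/HASH
    return target
```

With this layout, two saves of the same bytes under different GUIDs share one inode, and the link count of hashes/HASH shows how many attachment files still refer to it.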

Full Text Search Indexes

• Dovecot has abstract FTS API
• IMAP protocol says search is about “substring matching” (e.g. “ello” matches “hello”)
  – Almost no FTS engines support this
  – Few people seem to care about this anymore
• Currently supported FTS backends:
  – Squat: Dovecot’s own indexer, supports substring matching.
    • Currently index updating is too inefficient
  – Apache Solr
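
The difference between IMAP-style substring matching and word-based matching can be shown with two small functions. `word_match` is a deliberately simplified stand-in for how a typical word-based engine behaves, not the behavior of any particular FTS backend.

```python
def substring_match(text: str, key: str) -> bool:
    """IMAP-style search: case-insensitive substring matching."""
    return key.lower() in text.lower()


def word_match(text: str, key: str) -> bool:
    """Simplified word-based search: the key must equal a whole word."""
    return key.lower() in text.lower().split()
```

For the text "Hello world", `substring_match` finds "ello" while `word_match` does not; both find "hello".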

FTS: Solr

• Solr is a search engine server using Lucene
• Dovecot talks to Solr via HTTP
• Sharding via per-user fts_solr setting

Future

• FS API used for indexes and dbox
  – Support for key-value databases
  – Asynchronous disk I/O