Linux File Systems: Challenges and Futures · 2012 Storage Developer Conference. © Insert Your...
Transcript of Linux File Systems: Challenges and Futures · 2012 Storage Developer Conference. © Insert Your...
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Linux File Systems:Challenges and Futures
Ric WheelerRed Hat
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Overview
2
The Linux Kernel Process What Linux Does Well Today New Features in Linux File Systems Ongoing Challenges
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
What is Linux?
A set of projects and companiesVarious free and fee-based distributionsHardware vendors from handsets up to
mainframesMany different development communities
No single party can promise your favorite feature will be in all distributions
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Not Just a Kernel
RHEL has hundreds of projects each with Its own development community (upstream) Its own rules and processesChoice of licenses
Enterprise Linux vendorsWork in the upstream projectsTune, test and configureSupport the shipping versions
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
The Life Span of a Linux Enhancement
Origin of a featureDriven through standards like T10 or IETFPushed by a single vendorCreated by a developer
Proposed in the upstream communityPrototype patches postedFeedback and testingAdvocacy for inclusion
Move into a “free” distribution Shipped and supported by an enterprise
distribution
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
The Linux Community is Huge
Most active companies for the new 3.5 kernelNo affiliation - 12.0%Red Hat - 10.2% Intel - 9.7%Unknown - 7.8%Linaro - 4.7%Novell - 4.0%Texas Instruments - 2.9% IBM - 2.6%Linux Foundation - 2.5%Google - 2.4%Samsung - 2.3%
NetApp is at 17th place with 1.4% Statistics from: http://www.lwn.net/507986/
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
File & Storage Developers: LSF/MM 2012
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Things We Get Right
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
LVM and Block Layer LVM and device mapper has had an activity
spurtNew support for thin provisioned targetLVM can manage MD RAID devicesNative multipath increasingly used in high end
accounts New open source drivers for PCI-e SSD cards
Micron driver is now upstream Intel has been promoting the NVM express
standard and driver Multiple ways to use SSD devices as a cache
Bcache, vendor specific, fscache, dm-cache
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Ext3 File System
Ext3 was the most common file system in LinuxCommon default historically for distributionsApplications tuned to its behaviors (fsync...)Familiar to most system administrators
Ext3 challengesVery large file system repair (fsck) timeLimited scalability – 16TB max fs sizeSignificantly slower than other file systemsdirect/indirect, bitmaps, no delalloc ...
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
The Ext4 filesystem
Ext4 has many compelling new featuresExtent based allocationFaster fsck time (up to 10x over ext3)Delayed allocation, preallocationHigher bandwidth
Users still on EXT3 have an easy migration pathThe same commands and utilities
Large and active developer community that crosses multiple vendors
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
The XFS File System
XFS is very robust and scalableVery good performance for large storage
configurations and large serversMany years of use on large (> 16TB) storage
XFS is the most common file system used in serious storage appliances
Reasonable sized and active developer community that crosses a few vendors
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
The BTRFS File System
BTRFS is the newest local file system More integrated approach to the storage stack
Has its own internal RAID and snapshot support
Does full data integrity checks for metadata and user data
Compression supportCan dynamically grow and shrink
Ships in multiple enterprise and community distributions
Large and active developer community that crosses multiple vendors
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Active Maintenance and Development
Since kernel v2.6.18 (~RHEL5):Ext3 : 556 commits, ~136 authorsExt4 : 1649 commits, ~213 authorsXFS : 1857 commits, ~136 authorsBtrfs : 2228 commits, ~139 authors
Each file system has relatively few very active authors
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
New Features Tend to be Widely Supported Ext4, XFS, btrfs all have:
Delayed allocationPer-file space preallocationHole punch (not on btrfs yet)Trim / discardBarrier (now flush/FUA) supportDefragmentation
Ongoing work to unify the mount options across all file systems
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Linux NFS Servers
Support most of the 4.1 NFS specification Client supports all of the mandatory features
and all three pNFS layoutsServer supports most of the mandatory
features (not pNFS) Reasonably good performance for streaming
and small file workloads Supports pretty much any transport
Ethernet, IB, .... Increased involvement in IETF
Direct community member engagement and traditional standards people also reach out to Linux developers
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
CIFS (aka Samba) Support
Samba is robust and widely usedHas cluster support with CTDB
In kernel CIFS module provides an SMB client solution for Linux usersPerformance now approaches NFS
performance Good community relations with Microsoft
Multiple plugfest like events per year Regular engineering calls
Large, multi-vendor development communitySambaXP conference each year just for
Samba (and CIFS client) development
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
What Do We Get Wrong?
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Linux NFS Servers Problems
Experience with NFS 4.0 and 4.1 still relatively newExpect to get increases in user baseWill complete any lingering rough edges on
server implementation Lacks support for clustered NFS servers
Running with a shared file system back end “mostly” works
Ongoing work to resolve lock recovery deficiencies
No pNFS server code in upstream yet
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Samba Challenges
Microsoft is moving rapidly to SMB3.0Specification will be completely finalized once
Windows8 server shipsFixes performance issuesGood support for clustered servers
Samba support for SMB2.1 mostly there SMB3.0 development
Samba plans for SMB3.0 support underwayCIFS client support limited to SMB1
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Lack of Rich ACL Support
Windows and Linux/UNIX are really differentWindows locks are mandatoryLinux locks are advisory
Exporting the same file system via both NFS and CIFS leads to data corruption for lock users
Rich ACL patch provides the missing supportNeed the “rich ACL” patches to add support
for Windows style semantics Currently not actively being worked on
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Ext3 Challenges
Ext3 challengesFile system repair (fsck) time can be
extremely longLimited scalability - maximum file system size
of 16TB Major performance limitations
Can be significantly slower than other local file systems
Dwindling developer pool
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Ext4 Challenges
Ext4 challengesLarge device support (greater than 16TB) is
relatively newHas different behavior over system failure
than ext3 users are used to Usability concerns
Lots of mount options and tuning parametersRelies on complex and high powered tools to
support LVM and RAID configurations
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
XFS Challenges
XFS challengesNot as well known by many customers and
field support peopleUntil recently, had performance issues with
meta-data intensive (create/unlink) workloads (fixed in upstream and recent enterprise releases like RHEL6.2)
Similar usability concernsFair number of mount options and tuning
parametersRelies on complex and high powered tools to
support LVM and RAID configurations
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
BTRFS Worries
Repair tool still very young Ongoing worries with the hard bits of doing
“copy on write” file systemsENOSPC took a while (fixed now!)Encryption yet to comeCOW can fragment oft-written files
BTRFS performance analysis and testing gets less cycles than XFS and ext4 workChanging now that btrfs is in full support from
a few distros
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
All Things Management Related
Linux systems have a tradition of relying on third party management toolsLots of power tools for expertsFew tools appropriate for casual users
Many ways to do one thingMultiple RAID, SSD block caching layersOften the same thing is done – differently – in
multiple layers!
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Ongoing Work Worth Following
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
NFS & Samba
Active work on support for clustered storage Lock recovery work debate in NFS upstreamMultiple (out of tree) parallel NFS serversFedFS, migration support
All things to do with SMB3.0 Shared export of the same FS via NFS & SMB NFS V4.2 adds new support for
Copy offload, FedFS and Labeled NFS
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Ext4 Scaling & Features
Bigalloc (since kernel 3.2)Workaround for bitmap scalability issuesAllocates multiples of 4k blocks at a timeNot true large filesystem blocks, but close?
Inline Data - planned(maybe?)Store data inline in (larger) inodesMitigate bigalloc waste?
Metadata Checksumming - planned
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
XFS Scaling & Features
“Delayed logging” is donedramatically improved metadata performancedefault since v2.6.39Last™ big performance issue
Integrity work is nextCRCs on all metadata and logFS UUID to detect misdirected writesTransaction rollback in the face of errorsBackground scrub
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
BTRFS Scaling & Features
Scaling work here and there Mostly still fleshing out features
Checksumming was done earlyRAID 5/6QuotasDedupEncryption
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Caching on SSD Devices
Active work on getting Google's bcache ready for integration
Joe Thornber is working on a dm-cache target for device mapperSimple cache target which can be a layer in a
device mapper stacks.Modular policy which allows anyone to write
their own policyReuses the persistent-data library from thin
provisioninghttps://github.com/jthornber/linux-
2.6/blob/thin-dev/Documentation/device-mapper/dm-cache.txt
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Management Work
LibstoragemgmtProvides a library to do common block level
operations on storage arraysFull time developers and storage vendor
participationhttp://sourceforge.net/apps/trac/libstoragemgmt
System Storage ManagerBtrfs like “ease of use” for xfs, ext4 on top of
LVMhttp://sourceforge.net/p/storagemanager/hom
e/Home/
2012 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.
Resources & Questions
ResourcesLinux Weekly News: http://lwn.net/Mailing lists like linux-scsi, linux-ide, linux-
fsdevel, etc Storage & file system focused events
LSF workshopLinux Foundation events Linux Plumbers
IRC irc.freenode.net irc.oftc.net