Django Files — A Short Talk (slides only)

Transcript of Django Files — A Short Talk (slides only)

Page 1: Django Files — A Short Talk (slides only)

Files

@jaylett

Page 2: Django Files — A Short Talk (slides only)

Files

in an exciting adventure with dinosaurs

Page 3: Django Files — A Short Talk (slides only)

Files

in an exciting adventure with dinosaurs

Page 4: Django Files — A Short Talk (slides only)

Files

a brief talk

Page 5: Django Files — A Short Talk (slides only)

TZ=CET ls -ltr talk/
-rwxr--r-- 1 jaylett django 6 6 Nov 14:01 Files
-rwxr--r-- 1 jaylett django 5 6 Nov 14:02 Files and HTTP
-rwxr--r-- 1 jaylett django 15 6 Nov 14:04 Files in the ORM
-rwxr--r-- 1 jaylett django 13 6 Nov 14:08 Storage backends
-rwxr--r-- 1 jaylett django 25 6 Nov 14:20 Static files
-rwxr--r-- 1 jaylett django 8 6 Nov 14:35 Form media
-rwxr--r-- 1 jaylett django 29 6 Nov 14:40 Asset pipelines
-rwxr--r-- 1 jaylett django 6 6 Nov 14:55 What next?

Page 6: Django Files — A Short Talk (slides only)

Files

Page 7: Django Files — A Short Talk (slides only)

Python files

Page 8: Django Files — A Short Talk (slides only)

Django files

Page 9: Django Files — A Short Talk (slides only)

Django files

Awesome!

Page 10: Django Files — A Short Talk (slides only)

The File family

• File — or ImageFile, if it might be an image

• ContentFile / SimpleUploadedFile in tests

• which have a different parameter order

Page 11: Django Files — A Short Talk (slides only)

• #10541 cannot save file from a pipe

Page 12: Django Files — A Short Talk (slides only)

Files and HTTP

Page 13: Django Files — A Short Talk (slides only)

UploadedFile

• “behaves somewhat like a file object”

• temporary file and memory variants

• custom upload handlers
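
The "temporary file and memory variants" idea can be illustrated outside Django. This is a minimal sketch using only the standard library, not Django's upload handler code: `store_upload` is a hypothetical helper, and the 2621440-byte threshold mirrors Django's default `FILE_UPLOAD_MAX_MEMORY_SIZE`. Small uploads stay in memory; larger ones spill to a temporary file on disk.

```python
import tempfile


def store_upload(chunks, max_memory_size=2621440):
    """Collect upload chunks in memory, spilling to disk past the threshold.

    `store_upload` is a hypothetical helper for illustration only.
    """
    buffer = tempfile.SpooledTemporaryFile(max_size=max_memory_size)
    for chunk in chunks:
        buffer.write(chunk)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer


upload = store_upload([b"first chunk, ", b"second chunk"])
print(upload.read())
```

Either way the caller gets a file-like object back, which is the point of "behaves somewhat like a file object": code consuming the upload should not need to care which variant it received.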

Page 14: Django Files — A Short Talk (slides only)

forms.FileField

# forms-filefield.py

class FileForm(forms.Form):
    uploaded = forms.FileField()

def upload(request):
    if request.method == 'POST':
        form = FileForm(request.POST, request.FILES)
        if form.is_valid():
            request.FILES['uploaded']  # do something!
            return HttpResponseRedirect('/next/')
    else:
        form = FileForm()
    return render_to_response(
        'upload.html',
        {'form': form},
    )

Page 15: Django Files — A Short Talk (slides only)

Again, it works

Page 16: Django Files — A Short Talk (slides only)

• #15879 multipart/form-data filename="" not handled as file

• #17955 Uploading a file without using django forms

• #18150 Uploading a file ending with a backslash fails

• #20034 Upload handlers provide no way to retrieve previously parsed POST variables

• #21588 "Modifying upload handlers on the fly" documentation doesn't replicate internal magic

Page 17: Django Files — A Short Talk (slides only)

Files in the ORM

Page 18: Django Files — A Short Talk (slides only)

Files in the ORM

# orm-file.py

class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> print w.infosheet.url
https://media.root/relative/to/media/root.pdf
>>> w.infosheet = ContentFile("A boring bit of text", "file.txt")
>>> print w.infosheet.url
https://media.root/file.txt

Page 19: Django Files — A Short Talk (slides only)

• #5619 FileField and ImageField return the wrong path/url before calling save_FOO_file()

• #10244 FileFields can't be set to NULL in the db

• #13809 FileField open method is only accepting 'rb' modes

• #14039 FileField special-casing breaks MultiValueField including a FileField

• #13327 FileField/ImageField accessor methods throw unnecessary exceptions when they are blank or null.

• #17224 determine and document the use of default option in context of FileField

• #25547 refresh_from_db leaves FieldFile with reference to db_instance

Page 20: Django Files — A Short Talk (slides only)

Files in the ORM

# orm-file.py

class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> print w.infosheet.url
http://media.root/relative/to/media/root.pdf
>>> w.infosheet = ContentFile("A boring bit of text", "file.txt")
>>> print w.infosheet.url
http://media.root/file.txt
>>> w.infosheet = None
>>> print w.infosheet.url

Page 21: Django Files — A Short Talk (slides only)

Files in the ORM

# orm-file.py

class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> print w.infosheet.url
http://media.root/relative/to/media/root.pdf
>>> w.infosheet = ContentFile("A boring bit of text", "file.txt")
>>> print w.infosheet.url
http://media.root/file.txt
>>> w.infosheet = None
>>> print w.infosheet.url
>>> print type(w.infosheet)
<class 'django.db.models.fields.files.FieldFile'>

Page 22: Django Files — A Short Talk (slides only)

FieldFile

• magical autoconversion for anything (within reason)

• this happens using FileDescriptor classes which, well, let’s just ignore that

Page 23: Django Files — A Short Talk (slides only)

FileField

# orm-fieldfile.py

class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> w.infosheet = None
>>> w.infosheet == None
True
>>> w.infosheet is None
False
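
Why `== None` is true while `is None` is false can be shown with a plain-Python descriptor. This is a rough sketch of the general mechanism, not Django's code: `FieldFileSketch` and `FileDescriptorSketch` are hypothetical stand-ins, and the real classes do considerably more.

```python
class FieldFileSketch:
    """Hypothetical stand-in for a FieldFile: wraps a name that may be None."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Compare by name, so a wrapper around None equals None.
        if hasattr(other, "name"):
            return self.name == other.name
        return self.name == other

    def __hash__(self):
        return hash(self.name)


class FileDescriptorSketch:
    """Hypothetical descriptor that wraps whatever was assigned on access."""

    def __set_name__(self, owner, attr):
        self.attr = attr

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = instance.__dict__.get(self.attr)
        if not isinstance(value, FieldFileSketch):
            # Autoconvert on read: strings and None both become wrappers.
            value = FieldFileSketch(value)
            instance.__dict__[self.attr] = value
        return value

    def __set__(self, instance, value):
        instance.__dict__[self.attr] = value


class Wub:
    infosheet = FileDescriptorSketch()


w = Wub()
w.infosheet = None
print(w.infosheet == None)  # → True
print(w.infosheet is None)  # → False
```

The attribute never hands back the raw value that was assigned, which is what makes the autoconversion feel magical and also why an identity check against `None` cannot succeed.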

Page 24: Django Files — A Short Talk (slides only)

• #18283 FileField should not reuse FieldFiles

Page 25: Django Files — A Short Talk (slides only)

In ModelForms

# modelforms-filefield.py

class Wub(models.Model):
    infosheet = models.FileField()

class WubCreate(CreateView):
    model = Wub
    fields = ['infosheet']

Page 26: Django Files — A Short Talk (slides only)

ImageField

# orm-imagefile.py

class Wub(models.Model):
    infosheet = models.FileField()
    photo = models.ImageField()

>>> w = Wub(infosheet="relative/to/media/root.pdf", photo="relative/to/media/root.png")
>>> w.photo.width, w.photo.height
(480, 200)

Page 27: Django Files — A Short Talk (slides only)

• #15817 ImageField having [width|height]_field set sytematically compute the image dimensions in ModelForm validation process

• #18543 Non image file can be saved to ImageField

• #19215 ImageField's “Currently” and “Clear” Sometimes Don't Appear

• #21548 Add the ability to limit file extensions for ImageField and FileField

Page 28: Django Files — A Short Talk (slides only)

So many classes

Page 29: Django Files — A Short Talk (slides only)

So many classes

Page 30: Django Files — A Short Talk (slides only)

ImageField

# orm-imagefile-proxies.py

class Wub(models.Model):
    infosheet = models.FileField()
    photo = models.ImageField(width_field='photo_width', height_field='photo_height')
    photo_width = models.PositiveIntegerField(blank=True)
    photo_height = models.PositiveIntegerField(blank=True)

>>> w = Wub(infosheet="relative/to/media/root.pdf", photo="relative/to/media/root.png")
>>> w.photo.width, w.photo.height
(480, 200)
>>> w.photo_width, w.photo_height
(480, 200)

Page 31: Django Files — A Short Talk (slides only)

• #8307 ImageFile use of width_field and height_field is slow with remote storage backends

• #13750 ImageField accessing height or width and then data results in "I/O operation on closed file"

Page 32: Django Files — A Short Talk (slides only)

Storage backends

Page 33: Django Files — A Short Talk (slides only)

Storing files

• Stored in MEDIA_ROOT

• Served from MEDIA_URL

• Uses FileSystemStorage…by default

Page 34: Django Files — A Short Talk (slides only)

Another abstraction

Page 35: Django Files — A Short Talk (slides only)

What can Storage do?

• oriented around files, rather than a general FS API

• open / save / delete / exists / size / path / url

• make an available name, ie one that doesn’t clash

• modified, created, accessed times

• … also listdir
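
"Make an available name" is the least obvious item in the list above. The sketch below shows the general idea in plain Python, assuming a numeric-suffix strategy; it is not Django's implementation, which differs in detail, and `get_available_name` here takes a hypothetical `exists` callable instead of living on a storage class.

```python
import itertools
import os


def get_available_name(name, exists):
    """Return `name`, or a suffixed variant for which `exists` is false.

    `exists` is any callable taking a name and returning True when taken.
    """
    if not exists(name):
        return name
    root, ext = os.path.splitext(name)
    for n in itertools.count(1):
        candidate = "%s_%d%s" % (root, n, ext)
        if not exists(candidate):
            return candidate


taken = {"report.pdf", "report_1.pdf"}
print(get_available_name("report.pdf", taken.__contains__))  # → report_2.pdf
```

Note that `os.path.splitext` only treats the last suffix as the extension, which is the kind of behaviour #23759 (listed later in this talk) is about.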

Page 36: Django Files — A Short Talk (slides only)

Choosing storage

# configuring-storage-settings.py
DEFAULT_FILE_STORAGE = 'dotted.path.Class'
MEDIA_ROOT = '/my-root'
MEDIA_URL = 'https://media.root/'

class MyStorage(FileSystemStorage):
    def __init__(self, **kwargs):
        kwargs.setdefault('location', '/my-root')
        kwargs.setdefault('base_url', 'https://media.root/')
        return super(MyStorage, self).__init__(**kwargs)

Page 37: Django Files — A Short Talk (slides only)

Choosing storage

# models.py
from mystorage import BetterStorage
storage_instance = BetterStorage()

class MyModel(models.Model):
    upload = models.FileField(
        storage=storage_instance
    )

Page 38: Django Files — A Short Talk (slides only)

• #9586 Shall upload_to return an urlencoded string or not?

• #12157 FileSystemStorage does file I/O inefficiently, despite providing options to permit larger blocksizes

• #15799 Document what exception should be raised when trying to open non-existent file

• #21602 FileSystemStorage._save() Should Save to a Temporary Filename and Rename to Attempt to be Atomic

• #23759 Storage.get_available_name should preserve all file extensions, not just the first one

• #23832 Storage API should provide a timezone aware approach

Page 39: Django Files — A Short Talk (slides only)

Reasons to override

• Model fields with different options

• Different storage engine entirely

Page 40: Django Files — A Short Talk (slides only)

Protected storage

FSS = FileSystemStorage

pstore = FSS(
    location='/protected',
    base_url="/p/",
)

class Profile(models.Model):
    resume = FileField(
        null=True,
        blank=True,
        storage=pstore,
    )

@login_required
def protected(request, path):
    f = pstore.open(path)
    return HttpResponse(f.chunks())

# patterns() takes a view prefix as its first argument; '' means none
urlpatterns += patterns(
    '',
    url(r'^p/(?P<path>.*)$', protected),
)

Page 41: Django Files — A Short Talk (slides only)

S3BotoStorage

• One of many in django-storages-redux

• Millions of options, somewhat undocumented

• configure with AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_STORAGE_BUCKET_NAME

• defaults for the rest mostly sane

Page 42: Django Files — A Short Talk (slides only)

Fun with S3Boto

AWS_S3_CUSTOM_DOMAIN = 'cdn.eg.com'
AWS_HEADERS = {
    'Cache-Control': 'max-age=31536000'
}  # a year or so
AWS_PRELOAD_METADATA = True

Page 43: Django Files — A Short Talk (slides only)

Protected S3 storage

protected_storage = S3BotoStorage(
    acl='private',
    querystring_auth=True,
    querystring_expire=600,
    # 10 minutes, try to ensure people
    # won't / can't share
)

model.field.url # works as expected
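
The idea behind query-string authentication with an expiry can be sketched with the standard library. This is deliberately not S3's signing scheme: the `expires` and `signature` parameter names, the `SECRET` key, and both helper functions are assumptions made up for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"not-a-real-secret"  # hypothetical signing key


def _sign(path, expires):
    message = ("%s:%d" % (path, expires)).encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def signed_url(path, lifetime=600, now=None):
    """Return `path` with an expiry time and a signature in the query string."""
    issued = time.time() if now is None else now
    expires = int(issued) + lifetime
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return "%s?%s" % (path, query)


def is_valid(url, now=None):
    """Check that the signature matches and the URL has not expired."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(signature, _sign(parts.path, expires))


url = signed_url("/p/resume.pdf", lifetime=600)
print(is_valid(url))
```

Because the expiry is part of the signed message, a recipient cannot extend the lifetime by editing the query string; a shared link simply stops working once the ten minutes are up.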

Page 44: Django Files — A Short Talk (slides only)

Code that uses storage

• Please test on both Windows and Linux/Unix

• Please test with something remote like S3Boto

• Please write your own tests to work with different storage backends

Page 45: Django Files — A Short Talk (slides only)

Static files

Helpful assets

Page 46: Django Files — A Short Talk (slides only)

staticfiles in 1.3

Page 47: Django Files — A Short Talk (slides only)

How it works

<!-- source.html -->
{% load static %}
<img src='{% static "asset/path.png" %}' width='200' height='100' alt=''>

<!-- out.html -->
<img src='https://static.root/asset/path.png' width='200' height='100' alt=''>

Page 48: Django Files — A Short Talk (slides only)

In development

Page 49: Django Files — A Short Talk (slides only)

collectstatic

Page 50: Django Files — A Short Talk (slides only)

• #24336 static server should skip for protocol-relative STATIC_URL

• #25022 collectstatic create self-referential symlink

Page 51: Django Files — A Short Talk (slides only)

staticfiles in 1.4

Page 52: Django Files — A Short Talk (slides only)

staticfiles in 1.4

Page 53: Django Files — A Short Talk (slides only)

• #23563 Make `staticfiles_storage` a public API

• #25484 static template tag outputs invalid HTML if storage class's url() returns a URI with '&' characters.

Page 54: Django Files — A Short Talk (slides only)

I make it sound bad

Really, it’s not

Page 55: Django Files — A Short Talk (slides only)

Still a moderately bad time

Page 56: Django Files — A Short Talk (slides only)

Still a moderately bad time

Page 57: Django Files — A Short Talk (slides only)

Serving assets best practice

Page 58: Django Files — A Short Talk (slides only)

Serving assets best practice

Page 59: Django Files — A Short Talk (slides only)

Serving assets best practice

Page 60: Django Files — A Short Talk (slides only)

Cached (1.4) / Manifest (1.7)

• Cache uses the main cache, or a dedicated one

• Manifest writes a JSON manifest file to the configured storage backend
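
The manifest approach boils down to hashing each file's content into its name and recording the mapping as JSON. The sketch below assumes a truncated SHA-256 digest and a manifest with `paths` and `version` keys; the hash choice, the `hashed_name` and `build_manifest` helpers, and the `"1.0"` version string are illustrative assumptions, not a description of Django's internals.

```python
import hashlib
import json
import posixpath


def hashed_name(name, content):
    """Insert a 12-hex-digit content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return "%s.%s%s" % (root, digest, ext)


def build_manifest(files):
    """Map each original name to its hashed name and serialise as JSON."""
    paths = {name: hashed_name(name, content) for name, content in files.items()}
    return json.dumps({"paths": paths, "version": "1.0"}, sort_keys=True)


files = {
    "css/site.css": b"body { margin: 0 }",
    "js/collapse.min.js": b"console.log('hi')",
}
manifest = json.loads(build_manifest(files))
print(manifest["paths"]["js/collapse.min.js"])
```

The hashed name changes whenever the content does, so the file can be served with a far-future cache lifetime. The cost is that every reference to the file has to go through the mapping.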

Page 61: Django Files — A Short Talk (slides only)

The problem

<!-- hashed-assets.html -->
<script src="/static/admin/js/collapse.min.c1a27df1b997.js">
<script src="/admin/editorial/article/2/autosave_variables.js">
<script src="/static/autosave/js/autosave.js">

Page 62: Django Files — A Short Talk (slides only)

The problem

<!-- hashed-assets.html -->
<script src="/static/admin/js/collapse.min.c1a27df1b997.js">
<script src="/admin/editorial/article/2/autosave_variables.js">
<script src="/static/autosave/js/autosave.js?v=2">

Page 63: Django Files — A Short Talk (slides only)

That’s not all

Page 64: Django Files — A Short Talk (slides only)

That’s not all

Page 65: Django Files — A Short Talk (slides only)

That’s not all

Page 66: Django Files — A Short Talk (slides only)

That’s not all

Page 67: Django Files — A Short Talk (slides only)

Manifests for artefacts

class LocalManifestMixin(ManifestFilesMixin):
    _local_storage = None

    def __init__(self, *args, **kwargs):
        super(LocalManifestMixin, self).__init__(*args, **kwargs)
        self._local_storage = FileSystemStorage()

    def read_manifest(self):
        try:
            with self._local_storage.open(self.manifest_name) as manifest:
                return manifest.read().decode('utf-8')
        except IOError:
            return None

    def save_manifest(self):
        payload = {
            'paths': self.hashed_files,
            'version': self.manifest_version,
        }
        if self._local_storage.exists(self.manifest_name):
            self._local_storage.delete(self.manifest_name)
        contents = json.dumps(payload).encode('utf-8')
        self._local_storage._save(
            self.manifest_name, ContentFile(contents)
        )

Page 68: Django Files — A Short Talk (slides only)

• #18929 CachedFilesMixin is not compatible with S3BotoStorage

• #19528 CachedFilesMixin does not rewrite rules for css selector with path

• #19670 CachedFilesMixin Doesn't Limit Substitutions to Extension Matches

• #20620 CachedFileMixin.post_process breaks when cache size is exceeded

• #21080 collectstatic post-processing fails for references inside comments

• #22353 CachedStaticFilesMixin lags in updating hashed names of other static files referenced in CSS

• #22972 HashedFilesMixin.patterns should limit URL matches to their respective filetypes

• #24243 Allow HashedFilesMixin to handle file name fragments

• #24452 Staticfiles backends using HashedFilesMixin don't update CSS files' hash when referenced media changes

• #25283 ManifestStaticFilesStorage does not works in edge cases while importing url font-face with IE hack

Page 69: Django Files — A Short Talk (slides only)

Some options for 3PA

• Fiat: All 3PAs use staticfiles

• Shim: all 3PAs use the admin approach, wrapping it in their own taglib. Duplication!

• Bless: move staticfiles into core and make the simple `load static` the same as `load staticfiles`

• Weakly bless staticfiles so `load static` behaves like the admin, and the admin static stuff goes away, and everyone just uses `load static`

Page 70: Django Files — A Short Talk (slides only)

Form media

Hurtful assets

Page 71: Django Files — A Short Talk (slides only)

Widgets, Forms, Admin

• Widgets might have specific assets to render properly: typically CSS & JS

• Forms might have specific assets to render properly, too. They’re made out of widgets and some other bits, so they use the same system

• Then individual admin screens (ModelAdmin) might have specific assets as well; they have Forms which have Widgets

• Amazingly this hasn’t escaped into View

Page 72: Django Files — A Short Talk (slides only)

This is a good thing

from django.contrib import admin
from django.contrib.staticfiles.templatetags.staticfiles import static

class ArticleAdmin(admin.ModelAdmin):

    # ...

    class Media:
        js = [
            '//tinymce.cachefly.net/4.1/tinymce.min.js',
            static('js/tinymce_setup.js'),
        ]

Page 73: Django Files — A Short Talk (slides only)

This isn’t a good thing

from django.contrib.staticfiles.templatetags.staticfiles import static
from django.contrib.admin.templatetags.admin_static import static

Page 74: Django Files — A Short Talk (slides only)

• #9357 Unable to subclass form Media class

• #12264 calendar.js depends on jsi18n but date widgets using it do not specify as required media

• #12265 Media (js/css) collection strategy in Forms has no order dependence concept

• #13978 Allow inline js/css in forms.Media

• #18455 Added hooks to Media for staticfiles app

• #21221 Widgets and Admin's Media should use the configured staticfiles storage to create the right path to a file

• #21318 Clarify the ordering of the various Media classes

• #21987 Allow Media objects to have their own MEDIA_TYPES

• #22298 Rename Media to Static

Page 75: Django Files — A Short Talk (slides only)

Some options

• Fiat: all 3PAs use staticfiles explicitly

• Shim: all 3PAs use the admin approach explicitly

• Bless: move staticfiles into core; Media.absolute_path uses its storage backend

• Weakly bless staticfiles, ie Media.absolute_path uses the admin trick

Page 76: Django Files — A Short Talk (slides only)

This world’s a mess anyway

• no convenient API to get (CSS, JS) media

• can’t dedupe between forms if you have many on one page

Page 77: Django Files — A Short Talk (slides only)

This world’s a mess anyway

• no convenient API to get (CSS, JS) media

• can’t dedupe between forms if you have many on one page

• some things are global, eg jQuery, and can’t easily dedupe between a widget and a site-wide library

• to say nothing of different versions
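
To make the deduplication complaint concrete, here is a sketch of what a page-level helper would have to do if every form's assets were available as plain lists. `merge_media` is hypothetical; nothing like it is described in this talk as an existing Django API.

```python
def merge_media(*asset_lists):
    """Merge asset lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for assets in asset_lists:
        for path in assets:
            if path not in seen:
                seen.add(path)
                merged.append(path)
    return merged


widget_js = ["jquery.js", "calendar.js"]
other_form_js = ["jquery.js", "autosave.js"]
print(merge_media(widget_js, other_form_js))
# → ['jquery.js', 'calendar.js', 'autosave.js']
```

Even this only works on exact path matches. It would not notice `jquery-1.11.js` and `jquery-2.1.js` being the same library, nor a copy loaded site-wide outside the form machinery, which is the harder half of the problem.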

Page 78: Django Files — A Short Talk (slides only)

Asset pipelines

Page 79: Django Files — A Short Talk (slides only)

What? Why?

• compilation

Page 80: Django Files — A Short Talk (slides only)

What? Why?

• compilation

• concatenation

Page 81: Django Files — A Short Talk (slides only)

What? Why?

• compilation

• concatenation or linking/encapsulation

Page 82: Django Files — A Short Talk (slides only)

What? Why?

• compilation

• concatenation or linking/encapsulation

• minification

Page 83: Django Files — A Short Talk (slides only)

What? Why?

• compilation

• concatenation or linking/encapsulation

• minification

• hashing and caching
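
The stages listed above can be strung together in a toy pipeline. This is a sketch only: the regex-based CSS minifier is naive and would mangle real-world stylesheets, compilation is skipped entirely, and all function names and the 8-digit hash length are illustrative choices.

```python
import hashlib
import re


def concatenate(sources):
    """Join source files into a single bundle."""
    return "\n".join(sources)


def minify_css(css):
    """Naive minifier: strip comments and collapse whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
    css = re.sub(r"\s+", " ", css)
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)
    return css.strip()


def fingerprint(stem, ext, content):
    """Build an output name containing a short hash of the content."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:8]
    return "%s.min.%s%s" % (stem, digest, ext)


sources = [
    "/* nav */\nnav { color: red; }",
    "footer {\n  margin: 0;\n}",
]
bundle = minify_css(concatenate(sources))
print(bundle)  # → nav{color:red;}footer{margin:0;}
print(fingerprint("site", ".css", bundle))
```

Hashing comes last on purpose: the name should reflect the bytes actually served, so any change in an earlier stage produces a new URL and sidesteps stale caches.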

Page 84: Django Files — A Short Talk (slides only)

Writing the HTML

<!-- rendered.html -->
<script type='text/javascript' src='site.min.48fb66c7.js'>
<link rel='stylesheet' type='text/css' href='site.min.29557b4f.css'>

Page 85: Django Files — A Short Talk (slides only)

Focus on targets

<!-- external-syntax.html -->
<script type='text/javascript' src='{% static "site.js" %}'>
<link rel='stylesheet' type='text/css' href='{% static "site.css" %}'>

Page 86: Django Files — A Short Talk (slides only)

Focus on sources

<!-- internal-syntax-1.html -->
<script type='text/javascript' src='menu.js'>
<script type='text/javascript' src='index.js'>
<link rel='stylesheet' type='text/css' href='nav.css'>
<link rel='stylesheet' type='text/css' href='footer.css'>
<link rel='stylesheet' type='text/css' href='index.css'>

<!-- internal-syntax-2.html -->
{% asset js 'menu.js' %}
{% asset js 'index.js' %}
{% asset css 'nav.css' %}
{% asset css 'footer.css' %}
{% asset css 'index.css' %}

Page 87: Django Files — A Short Talk (slides only)

Some asset pipelines

Page 88: Django Files — A Short Talk (slides only)

Rails / Sprocket

<%= stylesheet_link_tag "application", media: "all" %>
<%= javascript_include_tag "application" %>

Page 89: Django Files — A Short Talk (slides only)

Rails / Sprocket

//= require home
//= require moovinator
//= require slider
//= require phonebox

Page 90: Django Files — A Short Talk (slides only)

Rails / Sprocket

.class {
  background-image: url(<%= asset_path 'image.png' %>)
}

Page 91: Django Files — A Short Talk (slides only)

Rails / Sprocket

rake assets:precompile

Page 92: Django Files — A Short Talk (slides only)

Rails / Sprocket

rake assets:clean

Page 93: Django Files — A Short Talk (slides only)

Sprocket clones

• asset-pipeline (Express/node.js)

• sails (node.js)

• grails asset pipeline (Groovy)

• Pipe (PHP)

Page 94: Django Files — A Short Talk (slides only)

node.js

• express-cdn

• Broccoli

• Sigh

• gulp

• Webpack

Page 95: Django Files — A Short Talk (slides only)

gulp (node.js)

var gulp = require('gulp');
var sass = require('gulp-sass');
var sourcemaps = require('gulp-sourcemaps');
var rev = require('gulp-rev');

gulp.task('default', ['compile-scss']);

gulp.task('compile-scss', function() {
  // return the stream so gulp knows when the task has finished
  return gulp.src('source/stylesheets/**/*.scss')
    .pipe(sourcemaps.init())
    .pipe(sass({indentedSyntax: false, errLogToConsole: true}))
    .pipe(sourcemaps.write())
    .pipe(rev())
    .pipe(gulp.dest('static'));
});

Page 96: Django Files — A Short Talk (slides only)

gulp (node.js)

var gulp = require('gulp');
var sass = require('gulp-sass');
var sourcemaps = require('gulp-sourcemaps');
var rev = require('gulp-rev');

gulp.task('default', ['compile-scss']);

gulp.task('compile-scss', function() {
  // return the stream so gulp knows when the task has finished
  return gulp.src('source/stylesheets/**/*.scss')
    .pipe(sourcemaps.init())
    .pipe(sass({indentedSyntax: false, errLogToConsole: true}))
    .pipe(sourcemaps.write())
    .pipe(rev())
    .pipe(gulp.dest('static'))
    .pipe(rev.manifest())
    .pipe(gulp.dest('static'));
});

Page 97: Django Files — A Short Talk (slides only)

Django options

• Plain django (external pipeline + staticfiles)

• django-compressor

• django-pipeline

Page 98: Django Files — A Short Talk (slides only)

Plain Django

• Pipeline external to Django (use what you want)

• Hashes computed by staticfiles

• Sourcemap support is fiddly if you want hashes

Page 99: Django Files — A Short Talk (slides only)

django-compressor

• Integrated pipeline, supports precompilers &c

• Source files listed in templates

• Integrated hashing

• Can be used with staticfiles, but feels awkward

• Can support sourcemaps, via a plugin

Page 100: Django Files — A Short Talk (slides only)

django-pipeline

• Internal pipeline, supports precompilers &c

• Source to output mapping in Django settings

• Integrates with staticfiles better than compressor

• Hashing via staticfiles

• Doesn’t support sourcemaps directly

Page 101: Django Files — A Short Talk (slides only)

Django + webpack

• webpack-bundle-tracker + django-webpack-loader (Owais Lone 2015)

• Pipeline run by Webpack, emits a mapping file

• Template tag to resolve the bundle name to a URL relative to STATIC_ROOT

Page 102: Django Files — A Short Talk (slides only)

Django options

• django-compressor: fixed pipeline

• django-pipeline: fixed pipeline (+ config woes)

• staticfiles: doesn’t get hashes right

• webpack-loader: isn’t generic

Page 103: Django Files — A Short Talk (slides only)

The future?

• pipeline builds named bundles into output files

• pipeline writes manifest.json: a mapping of bundle name to output filename

• staticfiles storage reads in manifest.json on boot

• templates refer to the bundle name

• useful for staticfiles to be able to list static directories (eg for node pipeline search paths)
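
The proposal above amounts to a lookup table loaded once at startup. A minimal sketch, assuming a flat manifest that maps bundle names directly to output filenames; the `BundleManifest` class, its `url` method, and the `/static/` prefix are hypothetical and not an existing staticfiles API.

```python
import json


class BundleManifest:
    """Resolve bundle names to URLs using a pipeline-written manifest."""

    def __init__(self, manifest_text, static_url="/static/"):
        # Parsed once, e.g. at process start, rather than per request.
        self._bundles = json.loads(manifest_text)
        self._static_url = static_url

    def url(self, bundle_name):
        try:
            filename = self._bundles[bundle_name]
        except KeyError:
            raise LookupError("unknown bundle: %r" % bundle_name)
        return self._static_url + filename


# Stand-in for a manifest.json emitted by an external pipeline.
manifest_text = json.dumps({
    "site.js": "site.min.48fb66c7.js",
    "site.css": "site.min.29557b4f.css",
})
manifest = BundleManifest(manifest_text)
print(manifest.url("site.js"))  # → /static/site.min.48fb66c7.js
```

Templates would then name the bundle and never the hashed file, so the pipeline stays free to change how it fingerprints its output.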

Page 104: Django Files — A Short Talk (slides only)

Third-party apps

• can’t cooperate with your project’s pipeline

• don’t want to force a dependency on a pipeline

• so must precompile into files in your sdist

• possibly for staticfiles to sweep up (but we’ve discussed this bit before)

Page 105: Django Files — A Short Talk (slides only)

What next?

Page 106: Django Files — A Short Talk (slides only)

What next?

• bless or semi-bless staticfiles?

• deprecate CachedStaticFilesStorage?

• document the boundaries of our hashing?

• rename? kill? expand? form.Media

• asset management / document external pipelines

• fix some bugs ;-)

Page 107: Django Files — A Short Talk (slides only)

• #9433 File locking broken on AFP mounts

• #17686 file.save crashes on unicode filename

• #18233 file_move_safe overwrites destination file

• #18655 Media files should be served using file storage API

• #22961 StaticFilesHandler should not run middleware on 404

Page 108: Django Files — A Short Talk (slides only)
Page 109: Django Files — A Short Talk (slides only)

😨🐉

• Durbed (durbed.deviantart.com) under CC BY-SA 3.0

• “Happy New Year from Hell Creek”

• “Primal feathers”

Page 110: Django Files — A Short Talk (slides only)

James Aylett

@jaylett