Django Files — A Short Talk (slides only)
Files
@jaylett
Files
in an exciting adventure with dinosaurs
Files
a brief talk
TZ=CET ls -ltr talk/
-rwxr--r-- 1 jaylett django 6 6 Nov 14:01 Files
-rwxr--r-- 1 jaylett django 5 6 Nov 14:02 Files and HTTP
-rwxr--r-- 1 jaylett django 15 6 Nov 14:04 Files in the ORM
-rwxr--r-- 1 jaylett django 13 6 Nov 14:08 Storage backends
-rwxr--r-- 1 jaylett django 25 6 Nov 14:20 Static files
-rwxr--r-- 1 jaylett django 8 6 Nov 14:35 Form media
-rwxr--r-- 1 jaylett django 29 6 Nov 14:40 Asset pipelines
-rwxr--r-- 1 jaylett django 6 6 Nov 14:55 What next?
Files
Python files
Django files
Awesome!
The File family
• File — or ImageFile, if it might be an image
• ContentFile / SimpleUploadedFile in tests
• which have a different parameter order
• #10541 cannot save file from a pipe
Files and HTTP
UploadedFile
• “behaves somewhat like a file object”
• temporary file and memory variants
• custom upload handlers
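The "temporary file and memory variants" can be pictured with a small standard-library sketch. This is not Django's upload handler code: `buffer_upload` and its `declared_size` parameter are hypothetical names for illustration, and the threshold shown is assumed to match Django's documented default for FILE_UPLOAD_MAX_MEMORY_SIZE (2.5 MB).

```python
import io
import tempfile

# Assumed to mirror Django's documented default of 2.5 MB.
MAX_MEMORY_SIZE = 2621440


def buffer_upload(chunks, declared_size, max_memory_size=MAX_MEMORY_SIZE):
    """Collect an upload in memory if it is small, otherwise in a temp file.

    `chunks` is any iterable of bytes; `declared_size` is the size the
    client announced (a hypothetical parameter for this sketch).
    """
    if declared_size <= max_memory_size:
        target = io.BytesIO()
    else:
        target = tempfile.TemporaryFile()
    for chunk in chunks:
        target.write(chunk)
    target.seek(0)  # rewind so the caller can read from the start
    return target
```

Either way the caller gets back something that "behaves somewhat like a file object", which is the point of the abstraction.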
forms.FileField
# forms-filefield.py
class FileForm(forms.Form):
    uploaded = forms.FileField()

def upload(request):
    if request.method == 'POST':
        form = FileForm(request.POST, request.FILES)
        if form.is_valid():
            request.FILES['uploaded']  # do something!
            return HttpResponseRedirect('/next/')
    else:
        form = FileForm()
    return render_to_response(
        'upload.html',
        {'form': form},
    )
Again, it works
• #15879 multipart/form-data filename="" not handled as file
• #17955 Uploading a file without using django forms
• #18150 Uploading a file ending with a backslash fails
• #20034 Upload handlers provide no way to retrieve previously parsed POST variables
• #21588 "Modifying upload handlers on the fly" documentation doesn't replicate internal magic
Files in the ORM
# orm-file.py
class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> print w.infosheet.url
https://media.root/relative/to/media/root.pdf
>>> w.infosheet = ContentFile("A boring bit of text", "file.txt")
>>> print w.infosheet.url
https://media.root/file.txt
• #5619 FileField and ImageField return the wrong path/url before calling save_FOO_file()
• #10244 FileFields can't be set to NULL in the db
• #13809 FileField open method is only accepting 'rb' modes
• #14039 FileField special-casing breaks MultiValueField including a FileField
• #13327 FileField/ImageField accessor methods throw unnecessary exceptions when they are blank or null.
• #17224 determine and document the use of default option in context of FileField
• #25547 refresh_from_db leaves FieldFile with reference to db_instance
Files in the ORM
# orm-file.py
class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> print w.infosheet.url
http://media.root/relative/to/media/root.pdf
>>> w.infosheet = ContentFile("A boring bit of text", "file.txt")
>>> print w.infosheet.url
http://media.root/file.txt
>>> w.infosheet = None
>>> print w.infosheet.url
>>> print type(w.infosheet)
<class 'django.db.models.fields.files.FieldFile'>
FieldFile
• magical autoconversion for anything (within reason)
• this happens using FileDescriptor classes which, well, let’s just ignore that
FileField
# orm-fieldfile.py
class Wub(models.Model):
    infosheet = models.FileField()

>>> w = Wub(infosheet="relative/to/media/root.pdf")
>>> w.infosheet = None
>>> w.infosheet == None
True
>>> w.infosheet is None
False
• #18283 FileField should not reuse FieldFiles
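The `== None` but not `is None` behaviour falls out of the descriptor approach. The sketch below is a plain-Python illustration, not Django's FileDescriptor or FieldFile: `FakeFieldFile` and `FakeFileDescriptor` are hypothetical names, and the comparison logic is an assumption about how such a wrapper might reasonably behave.

```python
class FakeFieldFile(object):
    """Hypothetical stand-in for FieldFile: wraps a stored name (or None)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Compare by name, so a wrapper around None compares equal to None.
        if isinstance(other, FakeFieldFile):
            return self.name == other.name
        return self.name == other

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash(self.name)


class FakeFileDescriptor(object):
    """Hypothetical descriptor: wraps whatever was assigned when it is read."""

    def __init__(self, attname):
        self.attname = attname

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = instance.__dict__.get(self.attname)
        if not isinstance(value, FakeFieldFile):
            # The "magical autoconversion": strings and None become wrappers.
            value = FakeFieldFile(value)
            instance.__dict__[self.attname] = value
        return value

    def __set__(self, instance, value):
        instance.__dict__[self.attname] = value


class Wub(object):
    infosheet = FakeFileDescriptor("infosheet")


w = Wub()
w.infosheet = None
print(w.infosheet == None)  # the wrapper compares equal to None
print(w.infosheet is None)  # but reading the attribute never returns None itself
```

Because every read goes through `__get__`, the attribute is always a wrapper object, so identity checks against None can never succeed.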
In ModelForms
# modelforms-filefield.py
class Wub(models.Model):
    infosheet = models.FileField()

class WubCreate(CreateView):
    model = Wub
    fields = ['infosheet']
ImageField
# orm-imagefile.py
class Wub(models.Model):
    infosheet = models.FileField()
    photo = models.ImageField()

>>> w = Wub(infosheet="relative/to/media/root.pdf", photo="relative/to/media/root.png")
>>> w.photo.width, w.photo.height
(480, 200)
• #15817 ImageField having [width|height]_field set systematically compute the image dimensions in ModelForm validation process
• #18543 Non image file can be saved to ImageField
• #19215 ImageField's “Currently” and “Clear” Sometimes Don't Appear
• #21548 Add the ability to limit file extensions for ImageField and FileField
So many classes
ImageField
# orm-imagefile-proxies.py
class Wub(models.Model):
    infosheet = models.FileField()
    photo = models.ImageField(width_field='photo_width', height_field='photo_height')
    photo_width = models.PositiveIntegerField(blank=True)
    photo_height = models.PositiveIntegerField(blank=True)

>>> w = Wub(infosheet="relative/to/media/root.pdf", photo="relative/to/media/root.png")
>>> w.photo.width, w.photo.height
(480, 200)
>>> w.photo_width, w.photo_height
(480, 200)
• #8307 ImageFile use of width_field and height_field is slow with remote storage backends
• #13750 ImageField accessing height or width and then data results in "I/O operation on closed file"
Storage backends
Storing files
• Stored in MEDIA_ROOT
• Served from MEDIA_URL
• Uses FileSystemStorage…by default
Another abstraction
What can Storage do?
• oriented around files, rather than a general FS API
• open / save / delete / exists / size / path / url
• make an available name, ie one that doesn’t clash
• modified, created, accessed times
• … also listdir
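"Make an available name" is the least obvious item in that list, so here is a minimal sketch of the idea. It is not Django's implementation: `available_name` is a hypothetical helper, it takes the set of existing names as an argument rather than asking a backend, and it uses a numeric suffix purely for illustration.

```python
import posixpath


def available_name(name, existing):
    """Return `name`, or a suffixed variant that is not in `existing`."""
    if name not in existing:
        return name
    directory, filename = posixpath.split(name)
    # splitext only peels off the last extension, so "a.tar.gz" splits
    # into ("a.tar", ".gz").
    root, ext = posixpath.splitext(filename)
    counter = 1
    while True:
        candidate = posixpath.join(directory, "%s_%d%s" % (root, counter, ext))
        if candidate not in existing:
            return candidate
        counter += 1
```

With this naive split, `available_name("a.tar.gz", {"a.tar.gz"})` gives `a.tar_1.gz`, which is the kind of extension handling that ticket #23759 complains about.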
Choosing storage
# configuring-storage-settings.py
DEFAULT_FILE_STORAGE = 'dotted.path.Class'
MEDIA_ROOT = '/my-root'
MEDIA_URL = 'https://media.root/'

class MyStorage(FileSystemStorage):
    def __init__(self, **kwargs):
        kwargs.setdefault(
            'location', '/my-root'
        )
        kwargs.setdefault(
            'base_url', 'https://media.root/'
        )
        return super(
            MyStorage, self
        ).__init__(**kwargs)
Choosing storage
# models.py
from mystorage import BetterStorage
storage_instance = BetterStorage()

class MyModel(models.Model):
    upload = models.FileField(
        storage=storage_instance
    )
• #9586 Shall upload_to return an urlencoded string or not?
• #12157 FileSystemStorage does file I/O inefficiently, despite providing options to permit larger blocksizes
• #15799 Document what exception should be raised when trying to open non-existent file
• #21602 FileSystemStorage._save() Should Save to a Temporary Filename and Rename to Attempt to be Atomic
• #23759 Storage.get_available_name should preserve all file extensions, not just the first one
• #23832 Storage API should provide a timezone aware approach
Reasons to override
• Model fields with different options
• Different storage engine entirely
Protected storage
FSS = FileSystemStorage

pstore = FSS(
    location='/protected',
    base_url="/p/",
)

class Profile(models.Model):
    resume = FileField(
        null=True,
        blank=True,
        storage=pstore,
    )

@login_required
def protected(request, path):
    f = pstore.open(path)
    return HttpResponse(
        f.chunks()
    )

# patterns() takes a view prefix as its first argument
urlpatterns += patterns(
    '',
    url(
        r'^p/(?P<path>.*)$',
        protected,
    ),
)
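The protected view hands `f.chunks()` to the response so the file is streamed in blocks rather than read into memory at once. A standard-library sketch of that kind of generator follows; the `chunks` function and the 64 KB block size here are assumptions for illustration, not Django's File class.

```python
import io

DEFAULT_CHUNK_SIZE = 64 * 1024  # block size assumed for this sketch


def chunks(fileobj, chunk_size=DEFAULT_CHUNK_SIZE):
    """Yield successive blocks from a binary file object until it is exhausted."""
    while True:
        data = fileobj.read(chunk_size)
        if not data:
            break
        yield data


# Example: a 7-byte "file" read in 3-byte blocks.
blocks = list(chunks(io.BytesIO(b"abcdefg"), chunk_size=3))
```

An iterable like this can be consumed lazily by whatever writes the response body.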
S3BotoStorage
• One of many in django-storages-redux
• Millions of options, somewhat undocumented
• configure with AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_STORAGE_BUCKET_NAME
• defaults for the rest mostly sane
Fun with S3Boto
AWS_S3_CUSTOM_DOMAIN = 'cdn.eg.com'
AWS_HEADERS = {
    'Cache-Control': 'max-age=31536000'
}  # a year or so
AWS_PRELOAD_METADATA = True
Protected S3 storage
protected_storage = S3BotoStorage(
    acl='private',
    querystring_auth=True,
    querystring_expire=600,
    # 10 minutes, try to ensure people
    # won't / can't share
)

model.field.url # works as expected
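Query-string authentication with an expiry can be illustrated without S3 at all. The sketch below is not S3's signing scheme or anything from django-storages: `SECRET`, `signed_url` and `is_valid` are hypothetical names, and the HMAC construction is just one simple way to make a URL that stops working after a deadline.

```python
import hashlib
import hmac
import time

SECRET = b"not-a-real-key"  # hypothetical signing key


def _signature(path, expires):
    message = ("%s:%d" % (path, expires)).encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def signed_url(path, lifetime=600, now=None):
    """Return `path` with an expiry timestamp and a signature appended."""
    now = int(time.time()) if now is None else now
    expires = now + lifetime
    return "%s?expires=%d&signature=%s" % (path, expires, _signature(path, expires))


def is_valid(path, expires, signature, now=None):
    """Accept the request only before the deadline and with a matching signature."""
    now = int(time.time()) if now is None else now
    if now > expires:
        return False
    return hmac.compare_digest(_signature(path, expires), signature)
```

A short lifetime limits how long a shared link keeps working, which is the intent behind the 600-second setting on the slide.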
Code that uses storage
• Please test on both Windows and Linux/Unix
• Please test with something remote like S3Boto
• Please write your own tests to work with different storage backends
Static files
Helpful assets
staticfiles in 1.3
How it works
<!-- source.html -->
{% load static %}
<img src='{% static "asset/path.png" %}' width='200' height='100' alt=''>

<!-- out.html -->
<img src='https://static.root/asset/path.png' width='200' height='100' alt=''>
In development
collectstatic
• #24336 static server should skip for protocol-relative STATIC_URL
• #25022 collectstatic create self-referential symlink
staticfiles in 1.4
• #23563 Make `staticfiles_storage` a public API
• #25484 static template tag outputs invalid HTML if storage class's url() returns a URI with '&' characters.
I make it sound bad
Really, it’s not
Still a moderately bad time
Serving assets best practice
Cached (1.4) / Manifest (1.7)
• Cache uses the main cache, or a dedicated one
• Manifest writes a JSON manifest file to the configured storage backend
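What the hashed storages do to a file name can be sketched in a few lines. This is an illustration under stated assumptions, not the staticfiles code: `hashed_name` and `build_manifest` are hypothetical helpers, the 12-hex-digit MD5 prefix is assumed from the look of names such as `collapse.min.c1a27df1b997.js`, and the `"1.0"` version value is a placeholder.

```python
import hashlib
import json
import posixpath


def hashed_name(name, content):
    """Insert the first 12 hex digits of the content's MD5 before the extension."""
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return "%s.%s%s" % (root, digest, ext)


def build_manifest(files):
    """Map each original name to its hashed name and serialise the result as JSON.

    `files` maps a relative path to the file's bytes.
    """
    paths = dict((name, hashed_name(name, content)) for name, content in files.items())
    return json.dumps({"paths": paths, "version": "1.0"}, sort_keys=True)
```

Because the name depends only on the content, an unchanged file keeps its URL and can be cached for a long time, while any edit produces a new URL.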
The problem
<!-- hashed-assets.html -->
<script src="/static/admin/js/collapse.min.c1a27df1b997.js">
<script src="/admin/editorial/article/2/autosave_variables.js">
<script src="/static/autosave/js/autosave.js">
The problem
<!-- hashed-assets.html -->
<script src="/static/admin/js/collapse.min.c1a27df1b997.js">
<script src="/admin/editorial/article/2/autosave_variables.js">
<script src="/static/autosave/js/autosave.js?v=2">
That’s not all
Manifests for artefacts
class LocalManifestMixin(ManifestFilesMixin):
    _local_storage = None

    def __init__(self, *args, **kwargs):
        super(LocalManifestMixin, self).__init__(*args, **kwargs)
        self._local_storage = FileSystemStorage()

    def read_manifest(self):
        try:
            with self._local_storage.open(self.manifest_name) as manifest:
                return manifest.read().decode('utf-8')
        except IOError:
            return None

    def save_manifest(self):
        payload = {
            'paths': self.hashed_files,
            'version': self.manifest_version,
        }
        if self._local_storage.exists(self.manifest_name):
            self._local_storage.delete(self.manifest_name)
        contents = json.dumps(payload).encode('utf-8')
        self._local_storage._save(
            self.manifest_name, ContentFile(contents)
        )
• #18929 CachedFilesMixin is not compatible with S3BotoStorage
• #19528 CachedFilesMixin does not rewrite rules for css selector with path
• #19670 CachedFilesMixin Doesn't Limit Substitutions to Extension Matches
• #20620 CachedFileMixin.post_process breaks when cache size is exceeded
• #21080 collectstatic post-processing fails for references inside comments
• #22353 CachedStaticFilesMixin lags in updating hashed names of other static files referenced in CSS
• #22972 HashedFilesMixin.patterns should limit URL matches to their respective filetypes
• #24243 Allow HashedFilesMixin to handle file name fragments
• #24452 Staticfiles backends using HashedFilesMixin don't update CSS files' hash when referenced media changes
• #25283 ManifestStaticFilesStorage does not works in edge cases while importing url font-face with IE hack
Some options for 3PA
• Fiat: all 3PAs use staticfiles
• Shim: all 3PAs use the admin approach, wrapping it in their own taglib. Duplication!
• Bless: move staticfiles into core and make the simple `load static` the same as `load staticfiles`
• Weakly bless staticfiles so `load static` behaves like the admin, and the admin static stuff goes away, and everyone just uses `load static`
Form media
Hurtful assets
Widgets, Forms, Admin
• Widgets might have specific assets to render properly: typically CSS & JS
• Forms might have specific assets to render properly, too. They’re made out of widgets and some other bits, so they use the same system
• Then individual admin screens (ModelAdmin) might have specific assets as well; they have Forms which have Widgets
• Amazingly this hasn’t escaped into View
This is a good thing
from django.contrib import admin
from django.contrib.staticfiles.templatetags.staticfiles import static

class ArticleAdmin(admin.ModelAdmin):
    # ...
    class Media:
        js = [
            '//tinymce.cachefly.net/4.1/tinymce.min.js',
            static('js/tinymce_setup.js'),
        ]
This isn’t a good thing
from django.contrib.staticfiles.templatetags.staticfiles import static
from django.contrib.admin.templatetags.admin_static import static
• #9357 Unable to subclass form Media class
• #12264 calendar.js depends on jsi18n but date widgets using it do not specify as required media
• #12265 Media (js/css) collection strategy in Forms has no order dependence concept
• #13978 Allow inline js/css in forms.Media
• #18455 Added hooks to Media for staticfiles app
• #21221 Widgets and Admin's Media should use the configured staticfiles storage to create the right path to a file
• #21318 Clarify the ordering of the various Media classes
• #21987 Allow Media objects to have their own MEDIA_TYPES
• #22298 Rename Media to Static
Some options
• Fiat: all 3PAs use staticfiles explicitly
• Shim: all 3PAs use the admin approach explicitly
• Bless: move staticfiles into core; Media.absolute_path uses its storage backend
• Weakly bless staticfiles, ie Media.absolute_path uses the admin trick
This world’s a mess anyway
• no convenient API to get (CSS, JS) media
• can’t dedupe between forms if you have many on one page
• some things are global, eg jQuery, and can’t easily dedupe between a widget and a site-wide library
• to say nothing of different versions
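To make the dedupe complaint concrete, here is roughly what a project ends up writing by hand when several forms share a page. `merge_media` is a hypothetical helper, not part of Django's forms.Media, and it treats assets as plain path strings.

```python
def merge_media(*asset_lists):
    """Combine ordered asset lists, keeping the first occurrence of each path."""
    seen = set()
    merged = []
    for assets in asset_lists:
        for path in assets:
            if path not in seen:
                seen.add(path)
                merged.append(path)
    return merged


# Two forms that both want jQuery, in slightly different versions.
form_a = ["js/jquery-1.11.js", "js/datepicker.js"]
form_b = ["js/jquery-2.1.js", "js/datepicker.js", "js/autosave.js"]
page_js = merge_media(form_a, form_b)
```

Exact duplicates collapse, but the two jQuery versions both survive, because string comparison knows nothing about libraries or versions.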
Asset pipelines
What? Why?
• compilation
• concatenation or linking/encapsulation
• minification
• hashing and caching
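Those stages chain together naturally. The toy pipeline below is only a sketch of the shape: `concatenate`, `minify` and `build` are hypothetical functions, the "minifier" merely strips whitespace and blank lines, compilation is left out entirely, and the 8-hex-digit hash length is an arbitrary choice.

```python
import hashlib


def concatenate(sources):
    """Join several source texts into one bundle."""
    return "\n".join(sources)


def minify(text):
    """Very naive 'minifier': strip each line and drop the empty ones."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def build(bundle, ext, sources):
    """Return (output filename, output text) for a named bundle."""
    output = minify(concatenate(sources))
    digest = hashlib.md5(output.encode("utf-8")).hexdigest()[:8]
    return "%s.min.%s.%s" % (bundle, digest, ext), output


name, output = build("site", "js", ["var a = 1;\n\n", "  var b = 2;  "])
```

The output name changes whenever any source changes, so the result can be served with far-future cache headers.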
Writing the HTML
<!-- rendered.html -->
<script type='text/javascript' src='site.min.48fb66c7.js'>
<link rel='stylesheet' type='text/css' href='site.min.29557b4f.css'>
Focus on targets
<!-- external-syntax.html -->
<script type='text/javascript' src='{% static "site.js" %}'>
<link rel='stylesheet' type='text/css' href='{% static "site.css" %}'>
Focus on sources
<!-- internal-syntax-1.html -->
<script type='text/javascript' src='menu.js'>
<script type='text/javascript' src='index.js'>
<link rel='stylesheet' type='text/css' href='nav.css'>
<link rel='stylesheet' type='text/css' href='footer.css'>
<link rel='stylesheet' type='text/css' href='index.css'>

<!-- internal-syntax-2.html -->
{% asset js 'menu.js' %}
{% asset js 'index.js' %}
{% asset css 'nav.css' %}
{% asset css 'footer.css' %}
{% asset css 'index.css' %}
Some asset pipelines
Rails / Sprockets
<%= stylesheet_link_tag "application", media: "all" %>
<%= javascript_include_tag "application" %>
Rails / Sprockets
//= require home
//= require moovinator
//= require slider
//= require phonebox
Rails / Sprockets
.class {
  background-image: url(<%= asset_path 'image.png' %>)
}
Rails / Sprockets
rake assets:precompile
Rails / Sprockets
rake assets:clean
Sprockets clones
• asset-pipeline (Express/node.js)
• sails (node.js)
• grails asset pipeline (Groovy)
• Pipe (PHP)
node.js
• express-cdn
• Broccoli
• Sigh
• gulp
• Webpack
gulp (node.js)
var gulp = require('gulp');
var sass = require('gulp-sass');
var sourcemaps = require('gulp-sourcemaps');
var rev = require('gulp-rev');

gulp.task('default', ['compile-scss']);

gulp.task('compile-scss', function() {
  gulp.src('source/stylesheets/**/*.scss')
    .pipe(sourcemaps.init())
    .pipe(sass(
      {indentedSyntax: false, errLogToConsole: true }
    ))
    .pipe(sourcemaps.write())
    .pipe(rev())
    .pipe(gulp.dest('static'))
    .pipe(rev.manifest())
    .pipe(gulp.dest('static'));
});
Django options
• Plain Django (external pipeline + staticfiles)
• django-compressor
• django-pipeline
Plain Django
• Pipeline external to Django (use what you want)
• Hashes computed by staticfiles
• Sourcemap support is fiddly if you want hashes
django-compressor
• Integrated pipeline, supports precompilers &c
• Source files listed in templates
• Integrated hashing
• Can be used with staticfiles, but feels awkward
• Can support sourcemaps, via a plugin
django-pipeline
• Internal pipeline, supports precompilers &c
• Source to output mapping in Django settings
• Integrates with staticfiles better than compressor
• Hashing via staticfiles
• Doesn’t support sourcemaps directly
Django + webpack
• webpack-bundle-tracker + django-webpack-loader (Owais Lone 2015)
• Pipeline run by Webpack, emits a mapping file
• Template tag to resolve the bundle name to a URL relative to STATIC_ROOT
Django options
• django-compressor: fixed pipeline
• django-pipeline: fixed pipeline (+ config woes)
• staticfiles: doesn’t get hashes right
• webpack-loader: isn’t generic
The future?
• pipeline builds named bundles into output files
• pipeline writes manifest.json: a mapping of bundle name to output filename
• staticfiles storage reads in manifest.json on boot
• templates refer to the bundle name
• useful for staticfiles to be able to list static directories (eg for node pipeline search paths)
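The manifest handshake proposed above is small enough to sketch. Everything here is hypothetical: `BundleManifest` is not an existing Django class, the flat bundle-name-to-filename JSON layout is one possible format, and `STATIC_URL` is an assumed setting value.

```python
import json

STATIC_URL = "/static/"  # assumed setting value


class BundleManifest(object):
    """Load a bundle-name to output-filename mapping once, then resolve URLs."""

    def __init__(self, manifest_text):
        # In a real project this text would be read from manifest.json on boot.
        self._bundles = json.loads(manifest_text)

    def url(self, bundle_name):
        """Return the URL for a bundle; raises KeyError for unknown bundles."""
        return STATIC_URL + self._bundles[bundle_name]


manifest = BundleManifest(
    '{"site.js": "site.min.48fb66c7.js", "site.css": "site.min.29557b4f.css"}'
)
print(manifest.url("site.js"))
```

Templates would then name the bundle (`site.js`) and never see the hashed filename, which keeps the pipeline free to choose its own naming.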
Third-party apps
• can’t cooperate with your project’s pipeline
• don’t want to force a dependency on a pipeline
• so must precompile into files in your sdist
• possibly for staticfiles to sweep up (but we’ve discussed this bit before)
What next?
• bless or semi-bless staticfiles?
• deprecate CachedStaticFilesStorage?
• document the boundaries of our hashing?
• rename? kill? expand? form.Media
• asset management / document external pipelines
• fix some bugs ;-)
• #9433 File locking broken on AFP mounts
• #17686 file.save crashes on unicode filename
• #18233 file_move_safe overwrites destination file
• #18655 Media files should be served using file storage API
• #22961 StaticFilesHandler should not run middleware on 404
😨🐉
• Durbed (durbed.deviantart.com) under CC BY-SA 3.0
• “Happy New Year from Hell Creek”
• “Primal feathers”
James Aylett
@jaylett