Php internal architecture

Page 1: Php internal architecture

PHP Internal Architecture
Pluggable, Extendable, Useable

Page 2: Php internal architecture

Architecture
PHP piece by piece

Page 3: Php internal architecture

You should know the basics

Page 4: Php internal architecture

All the puzzle pieces

PHP

Input/Output
• SAPI
• Streams

Engine
• Lexer
• Parser
• AST
• Compiler
• Executor

Extensions
• Zend Extensions
• Compiled In
• Loaded at startup
• Loaded at runtime

Page 5: Php internal architecture

Running PHP

server makes request

SAPI talks to engine

engine runs

SAPI returns output to server

Page 6: Php internal architecture

How other languages do this

Python (CPython)
• mod_python (embedded Python interpreter, deprecated)
• mod_wsgi (embedded or daemon): either basically a mod_python copy, or speaking over Unix sockets to a Python interpreter with a special library installed
• command line interpreter
• FastCGI/CGI (using a library in Python)

Ruby (MRI)
• also known as “CRuby”
• Matz’s Ruby Interpreter
• use Rack (library) to:
  • write/run a Ruby webserver
  • use another server in between with hooks to nginx/apache (unicorn, passenger)
• use FastCGI/CGI

Page 7: Php internal architecture

And still more..

NodeJS
• Your app is your server
• This is a pain
• Write your own clustering or other neat features!!
• So you stick a process manager in front
• And you reverse proxy from apache/nginx
• Or you use passenger or some other server…

Perl
• Yes it still exists – shhh you in the back
• PSGI + plack
• mod_perl
• mod_psgi

Page 8: Php internal architecture

What makes PHP different?
• Shared nothing architecture by design
  • application lifecycle is per-request
  • no shared state natively
  • infinite horizontal scalability in the language itself
• HTTP is a first class citizen
  • You don’t need a library or framework
• SAPI is a first class citizen
  • Designed to have a server in front of it
  • No library necessary
• You don’t need a deployment tool to keep it all going

Page 9: Php internal architecture

The answer to your question is

Page 10: Php internal architecture

SAPI
Server API – the least understood feature in PHP

Page 11: Php internal architecture

What is a SAPI?
• Tells a Server how to talk to PHP via an API
  • Server API
  • Server Application Programming Interface
• “Server” is a bit broad as it means any type of Input/Output mechanism
• SAPIs do:
  • input arguments
  • output, flushing, file descriptors, interruptions, system user info
  • input filtering and optionally headers, POST data, HTTP specific stuff
  • Handling a stream for the request body
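A script can ask which SAPI is driving it. The sketch below is a minimal illustration using the built-in `PHP_SAPI` constant and `php_sapi_name()`; the `describeSapi()` helper is a hypothetical name introduced only for this example.

```php
<?php
// PHP_SAPI and php_sapi_name() both report the active SAPI,
// e.g. "cli", "cli-server", "fpm-fcgi", "cgi-fcgi" or "apache2handler".
$sapi = PHP_SAPI;

// Hypothetical helper: describe how PHP was started.
function describeSapi(string $sapi): string
{
    if ($sapi === 'cli' || $sapi === 'phpdbg') {
        return "Running on the command line via the '{$sapi}' SAPI";
    }
    return "Serving requests through the '{$sapi}' SAPI";
}

$message = describeSapi($sapi);
echo $message, PHP_EOL;
```

The same script reports a different name depending on whether it is run from a shell, under php-fpm, or inside Apache, which is the point of the SAPI layer: the engine stays the same and only the input/output side changes.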

Page 12: Php internal architecture

In the beginning
• CGI
  • Common gateway interface
  • Shim between web server and program

• Simple
• Stateless
• Slow
• Local
• Good security with Linux tools

• Slow
• Local
• Programs can have too much access
• Memory use not transparent (thrash and die!)

Page 13: Php internal architecture

Then there was PHP in a Webserver
• mod_php (apache2handler)

• Run the language directly in the webserver, speaking to a webserver’s module api

• Can access all of apache’s stuff

• Webserver handles all the request stuff, no additional sockets/processes

• It works well

• Requires prefork MPM or thread safe PHP

• Eats all your memories and never lets the system have it back

• Makes apache children take more memory

Page 14: Php internal architecture

CGI is slow: FastCGI to the rescue!
• Persistent processes but CGI mad style
• Biggest drawbacks?
  • “it’s old”
  • “I don’t like the protocol”
  • “it’s not maintained”
  • “other people say it’s not stable”
• Apache fcgi modules do kind of suck
• Nginx “just works”
• IIS8+ “just works”

Page 15: Php internal architecture

php-fpm – Make FastCGI better
• FastCGI Process Manager
• Adds more features than traditional FastCGI
  • Better process management including graceful stop/start
  • Uid/gid/chroot/environment/port/ini configuration per worker
  • Better logging
  • Emergency restart
  • Accelerated upload support
  • Dynamic/static child spawning

Page 16: Php internal architecture

CLI?
• Yes, in PHP the CLI is a SAPI
• (Did you know there’s a special Windows CLI that doesn’t pop a console window?)
• PHP “overloads” the CLI to have a command line webserver for easier development (even though it SHOULD be on its own)
• PHP did that because fighting with distros to always include the cli-server would have meant pain, and if you just grab php.exe the dev webserver is always available
• The CLI treats console STDIN/STDOUT as its request/response
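The "STDIN/STDOUT as request/response" idea can be sketched with plain stream functions. This is a minimal illustration, not how the CLI SAPI is implemented internally; `respond()` is a hypothetical helper, and in-memory streams stand in for the console so the sketch runs anywhere.

```php
<?php
// Hypothetical handler: reads the "request" from one stream and writes
// the "response" to another, the way a CLI script uses STDIN and STDOUT.
function respond($request, $response): int
{
    $lines = 0;
    while (($line = fgets($request)) !== false) {
        fwrite($response, strtoupper($line));
        $lines++;
    }
    return $lines;
}

// Stand-in streams; under the CLI SAPI the predefined STDIN and STDOUT
// constants could be passed instead.
$request = fopen('php://memory', 'w+');
fwrite($request, "hello\nworld\n");
rewind($request);

$response = fopen('php://memory', 'w+');
$handled = respond($request, $response);

rewind($response);
$body = stream_get_contents($response);
echo $body;
```

The development webserver mentioned above is started from the same binary with `php -S localhost:8000`.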

Page 17: Php internal architecture

php-embed
• A thin wrapper allowing PHP to be easily embedded via C
• Used for extensions in node, python, ruby, and perl to interact with PHP
• Corresponding extensions do exist for those languages embedded in PHP

Page 18: Php internal architecture

phpdbg
• Wait – there’s a debugger SAPI?
• Yes, yes there is

Page 19: Php internal architecture

litespeed
• It is a SAPI
• The server just went open source…
• I’ve never tried it, but they take care of the SAPI

Page 20: Php internal architecture

Just connect to the app?
• Use a webserver to reverse proxy to a webserver built into a framework?
• Smart to use a webserver that has already solved the hard stuff
• But the app/web framework on top needs to deal with
  • HTTP keepalive?
  • Gzip with caching?
  • X-Forwarded-For? Logging? Issues
  • Load balancing and failover?
  • HTTPS and caching?
  • ulimit? Remember we’re opening up a bunch of sockets!

Page 21: Php internal architecture

Well, PHP streams can do that

Page 22: Php internal architecture

Streams
Input and Output beyond the SAPI

Page 23: Php internal architecture

What is a Stream?
• Access input and output generically
• Can write and read linearly
• May or may not be seekable
• Comes in chunks of data
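Those four properties can be seen with a few built-in calls. The sketch below uses a `php://temp` stream as a stand-in for any input/output source; the chunk size of 4 bytes is arbitrary and chosen only to make the chunking visible.

```php
<?php
// Open a generic read/write stream; the same calls work on files,
// sockets, or php://stdin.
$stream = fopen('php://temp', 'w+');
fwrite($stream, 'abcdefghij');

// Ask the stream whether it supports seeking.
$meta = stream_get_meta_data($stream);
$seekable = $meta['seekable'];

// Read the data back linearly, in chunks of at most 4 bytes.
rewind($stream);
$chunks = [];
while (!feof($stream)) {
    $chunk = fread($stream, 4);
    if ($chunk === '' || $chunk === false) {
        break; // nothing left to read
    }
    $chunks[] = $chunk;
}
fclose($stream);

print_r($chunks);
```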

Page 24: Php internal architecture

How PHP Streams Work

Stream Contexts

Stream Wrapper

Stream Filter

ALL IO
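A small sketch of two of those pieces from userland, assuming the built-in `string.toupper` filter and the `http` context options (the option values here are illustrative only and no HTTP request is made):

```php
<?php
// A context carries options for a wrapper; these would only take effect
// if the context were passed when opening an http:// stream.
$context = stream_context_create([
    'http' => ['method' => 'GET', 'timeout' => 5],
]);
$options = stream_context_get_options($context);

// A filter transforms data as it passes through the stream.
$stream = fopen('php://memory', 'w+');
$filter = stream_filter_append($stream, 'string.toupper', STREAM_FILTER_WRITE);
fwrite($stream, 'all io goes through streams');
stream_filter_remove($filter);

rewind($stream);
$filtered = stream_get_contents($stream);
fclose($stream);

echo $filtered, PHP_EOL;
```

The wrapper is the third piece: it is selected by the scheme in the path (`php://` here), so the code never names it directly.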

Page 25: Php internal architecture

Definitions
• Socket
  • Bidirectional network stream that speaks a protocol
• Transport
  • Tells a network stream how to communicate
• Wrapper
  • Tells a stream how to handle specific protocols and encodings

Page 26: Php internal architecture

Built in Socket Transports
• tcp
• udp
• unix
• udg
• SSL extension
  • ssl
  • sslv2
  • sslv3
  • tls
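Which transports and wrappers a given build actually has can be listed at runtime. A minimal sketch using the built-in `stream_get_transports()` and `stream_get_wrappers()`; the exact lists depend on the PHP version and on which extensions (such as openssl) are loaded.

```php
<?php
// Socket transports registered in this build (tcp, udp, unix, ...).
$transports = stream_get_transports();

// Stream wrappers registered in this build (php, file, http, ...).
$wrappers = stream_get_wrappers();

echo 'Transports: ', implode(', ', $transports), PHP_EOL;
echo 'Wrappers:   ', implode(', ', $wrappers), PHP_EOL;

// The TLS transport is normally only present when openssl is loaded.
$hasTls = in_array('tls', $transports, true);
echo 'TLS available: ', $hasTls ? 'yes' : 'no', PHP_EOL;
```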

Page 27: Php internal architecture

You can write your own streams!
• You can do a stream wrapper in userland and register it
• But you need an extension to register them if they have a transport
• Extensions with streams include ssh, bzip2, openssl
• I’d really like the curl stream back (not with the compile flag, but curl://)

Page 28: Php internal architecture

Welcome to the Engine
Lexers and Parsers and Opcodes OH MY!

Page 29: Php internal architecture

Lexer
• checks PHP’s spelling
• turns into tokens
• see token_get_all for what PHP sees
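A quick way to look at the lexer's output from userland, assuming the tokenizer extension is available (it is enabled by default). The source string is an arbitrary example.

```php
<?php
$source = '<?php echo 1 + 2;';

// token_get_all() returns arrays for named tokens and plain strings
// for single-character tokens such as "+" or ";".
$names = [];
foreach (token_get_all($source) as $token) {
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}

echo implode(' ', $names), PHP_EOL;
```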

Page 30: Php internal architecture

Parser + AST
• checks PHP’s grammar
• E_PARSE means “bad phpish”
• creates AST
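The difference between spelling and grammar shows up when code lexes fine but does not parse. The sketch below assumes PHP 7 or later, where a parse failure inside `eval()` is thrown as a `ParseError` instead of being a fatal E_PARSE; `parses()` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: report whether a snippet gets past the parser.
// The leading "return;" means the snippet is compiled but its
// statements are not run.
function parses(string $code): bool
{
    try {
        eval('return; ' . $code);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

$good = parses('echo 1 + 2;'); // valid grammar
$bad  = parses('echo 1 +;');   // every token is valid, the grammar is not

var_dump($good, $bad);
```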

Page 31: Php internal architecture

Compiler
• Turns AST into Opcodes
• Allows for fancier grammar
• Opcodes can then be cached (opcache) skipping lex/parse/compile cycle

Page 32: Php internal architecture

Opcodes
• dump with http://derickrethans.nl/projects.html
• machine readable language which the runtime understands

Page 33: Php internal architecture

Engine (Virtual Machine)
• reads opcode
• does something
• zend extension can hook it!
• ???
• PROFIT

Page 34: Php internal architecture

Extensions
How a simple design pattern made PHP more useful

Page 35: Php internal architecture

“When I say that PHP is a ball of nails, basically, PHP is just this piece of shit that you just put all the parts together and you throw it against the wall and it fucking sticks”- Terry Chay

Page 36: Php internal architecture

So what is an extension?
• Written in C or C++
• Compiled statically into the PHP binary or as a shared object (so/dylib/dll)
• Provides
  • Bindings to a C or C++ library
    • even embed other languages
  • Code in C instead of PHP (speed)
    • template engine
  • Alter engine functionality
    • debugging

Page 37: Php internal architecture

So why an extension?
• add functionality from other languages (mainly C)
• speed
• to infinity and beyond!
  • intercept the engine
  • add debugging
  • add threading capability
  • the impossible (see: operator)

Page 38: Php internal architecture

About Extensions
• Types
  • Zend Extension
  • PHP Module
• Sources
  • Core Built in
  • Core Default
  • Core
  • PECL
  • Github and Other 3rd Party
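Both extension types can be listed from a running script. A minimal sketch using the built-in `get_loaded_extensions()`, whose optional boolean argument switches from PHP modules to Zend extensions; the output depends entirely on the build and php.ini.

```php
<?php
// Regular PHP modules (Core, standard, json, ...).
$modules = get_loaded_extensions();

// Zend extensions only (for example OPcache or a debugger, if loaded).
$zendExtensions = get_loaded_extensions(true);

echo 'PHP modules:     ', implode(', ', $modules), PHP_EOL;
echo 'Zend extensions: ', implode(', ', $zendExtensions) ?: '(none)', PHP_EOL;

// extension_loaded() checks for one module by name.
$hasStandard = extension_loaded('standard');
```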

Page 39: Php internal architecture

– “We need to foster a greater sense of community for people writing PHP extensions, […] Quite what this means hasn't been decided, although one of the major responsibilities is to spark up some community spirit, and that is the purpose of this email.”

- Wez Furlong, 2003

Page 40: Php internal architecture

What is PECL?
• PHP Extension Community Library
• The place for people to find PHP extensions
• No GPL code – license should be PHP license compatible (LGPL is ok but not encouraged)
• http://news.php.net/article.php?group=php.pecl.dev&article=5

Page 41: Php internal architecture

PECL Advantages
• Code reviews
  • See https://wiki.php.net/internals/review_comments
• Help from other devs with internal API changes (if in PHP source control)
  • https://svn.php.net/viewvc?view=revision&revision=297236
• Advertising and individual release cycles
  • http://pecl.php.net/news/
• pecl command line integration
  • actually just integration with PEAR installer (which supports binaries/compiling) and unique pecl channel
• php.net documentation!

Page 42: Php internal architecture

PECL Problems
• Has less oversight into code quality
  • peclqa?
  • not all source accessible
• no action taken for abandoned code
  • still has “siberia” modules mixed with “need a maintainer”
• never enough help
  • tests
  • bug triaging
  • maintainers
  • code reviews
  • docs!
• no composer integration
• Half the code in git, half in svn still, half… elsewhere …

Page 43: Php internal architecture

“It’s really free as in pull request”- me

Page 44: Php internal architecture

My extension didn’t make it faster!
• PHP is usually not the real bottleneck
• Do full stack profiling and benchmarking to see if PHP is the real bottleneck
• If PHP IS the real bottleneck you’re awesome – and you need to be writing stuff in C or C++
• Most times your bottleneck is not PHP but I/O
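Before reaching for C, it is worth timing the pieces separately. The sketch below is a crude stand-in for real profiling, assuming PHP 7.3 or later for `hrtime()`; `timeIt()` and both workloads are hypothetical and only show the shape of the comparison, not representative numbers.

```php
<?php
// Hypothetical helper: best wall-clock time in seconds over a few runs.
function timeIt(callable $work, int $runs = 5): float
{
    $best = INF;
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true); // monotonic clock, nanoseconds
        $work();
        $best = min($best, (hrtime(true) - $start) / 1e9);
    }
    return $best;
}

// Pure PHP work: a tight arithmetic loop.
$cpu = timeIt(function () {
    $sum = 0;
    for ($i = 0; $i < 100000; $i++) {
        $sum += $i;
    }
});

// I/O work: write a temporary file and read it back.
$io = timeIt(function () {
    $tmp = tempnam(sys_get_temp_dir(), 'io');
    file_put_contents($tmp, str_repeat('x', 1000000));
    file_get_contents($tmp);
    unlink($tmp);
});

printf("cpu: %.6fs  io: %.6fs\n", $cpu, $io);
```

In a real application the I/O side is usually a database or network call rather than a local file, so a full stack profiler gives a far more honest picture than a micro-benchmark like this.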

Page 45: Php internal architecture

What about other languages?
• Ruby gem
  • Will compile and install
• Node’s npm
  • Will compile and install
• Perl’s CPAN
  • Written in special “xs” language
  • Will compile and install
• Python
  • Mixed bag? Distutils can install or grab a binary

Page 46: Php internal architecture

FFI
Talk C without compiling

Page 47: Php internal architecture

What is FFI?
• Foreign Function Interface
• Most things written in C use libffi
• https://github.com/libffi/libffi

Page 48: Php internal architecture

Who has FFI?
• Java calls it JNI
• HHVM calls it HNI
• Python calls it “ctypes” (do not ask, stupidest name ever)
• C# calls it P/Invoke
• Ruby calls it FFI
• Perl has Inline::C (a bit of a mess)
• PHP calls it…

Page 49: Php internal architecture

FFI

Page 50: Php internal architecture

Oh wait…
• PHP’s FFI is rather broken
• PHP’s FFI has no maintainer
• It needs some TLC
• There’s MFFI but it’s not done
  • https://github.com/mgdm/MFFI
• Are you interested and not afraid?

Page 51: Php internal architecture

For the future?
• More SAPIs?
  • Websockets
  • PSR-7
  • Other ideas?
• Fix server-tests.php so we can test SAPIs
  • Only CGI and CLI are currently tested well
• More extensions
  • Guidelines for extensions
  • Better documentation
  • Builds + pickle + composer integration

Page 52: Php internal architecture

About Me
• http://emsmith.net
• [email protected]
• twitter - @auroraeosrose
• IRC – freenode – auroraeosrose
• #phpmentoring
• https://joind.in/talk/67433