PHP internal architecture
elizabeth-smith
PHP Internal Architecture
Pluggable, Extendable, Useable
Architecture
PHP piece by piece
You should know the basics
All the puzzle pieces
PHP
Input/Output
• SAPI
• Streams
Engine
• Lexer
• Parser
• AST
• Compiler
• Executor
Extensions
• Zend Extensions
• Compiled In
• Loaded at startup
• Loaded at runtime
Running PHP
server makes request
SAPI talks to engine
engine runs
SAPI returns output to server
How other languages do this
Python (CPython)
• mod_python (embedded Python interpreter, deprecated)
• mod_wsgi (embedded or daemon) – basically a mod_python copy OR speaking to a Python interpreter (with a special library installed) via Unix sockets
• command line interpreter
• FastCGI/CGI (using a library in Python)
Ruby (MRI)
• also known as “CRuby”
• Matz’s Ruby Interpreter
• use Rack (library) to:
  • write/run a Ruby webserver
  • use another server in between with hooks to nginx/apache (unicorn, passenger)
• use FastCGI/CGI
And still more..
NodeJS
• Your app is your server
• This is a pain
• Write your own clustering or other neat features!!
• So you stick a process manager in front
• And you reverse proxy from apache/nginx
• Or you use passenger or some other server….
Perl
• Yes it still exists – shhh you in the back
• PSGI + Plack
• mod_perl
• mod_psgi
What makes PHP different?
• Shared nothing architecture by design
  • application lifecycle is per-request
  • no shared state natively
  • infinite horizontal scalability in the language itself
• HTTP is a first class citizen
  • You don’t need a library or framework
• SAPI is a first class citizen
  • Designed to have a server in front of it
  • No library necessary
• You don’t need a deployment tool to keep it all going
The answer to your question is
SAPI
Server API – the least understood feature in PHP
What is a SAPI?
• Tells a Server how to talk to PHP via an API
  • Server API
  • Server Application Programming Interface
• “Server” is a bit broad as it means any type of Input/Output mechanism
• SAPIs do:
  • input arguments
  • output, flushing, file descriptors, interruptions, system user info
  • input filtering and optionally headers, POST data, HTTP specific stuff
  • Handling a stream for the request body
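A script can ask which SAPI is driving it. The sketch below uses the built-in php_sapi_name() function; the describeSapi() helper and its human-readable labels are illustrative, not part of PHP.

```php
<?php
// Map the SAPI name PHP reports to a human-readable label.
// describeSapi() and the labels are illustrative; the keys are names PHP reports.
function describeSapi(string $sapi): string
{
    $known = [
        'cli'            => 'command line',
        'cli-server'     => 'built-in development web server',
        'fpm-fcgi'       => 'php-fpm (FastCGI Process Manager)',
        'cgi-fcgi'       => 'CGI / FastCGI binary',
        'apache2handler' => 'Apache module (mod_php)',
        'embed'          => 'embedded in another program',
        'phpdbg'         => 'phpdbg debugger',
        'litespeed'      => 'LiteSpeed',
    ];

    return $known[$sapi] ?? 'unknown SAPI';
}

// php_sapi_name() returns the same string as the PHP_SAPI constant.
echo php_sapi_name(), ' => ', describeSapi(php_sapi_name()), PHP_EOL;
```

The same script reports a different name depending on whether it runs from the command line, under php-fpm, or inside Apache.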
In the beginning
• CGI
  • Common gateway interface
  • Shim between web server and program
• Simple
• Stateless
• Slow
• Local
• Good security with linux tools
• Slow
• Local
• Programs can have too much access
• Memory use not transparent (thrash and die!)
Then there was PHP in a Webserver
• mod_php (apache2handler)
• Run the language directly in the webserver, speaking to a webserver’s module api
• Can access all of apache’s stuff
• Webserver handles all the request stuff, no additional sockets/processes
• It works well
• Requires prefork MPM or thread safe PHP
• Eats all your memories and never lets the system have it back
• Makes apache children take more memory
CGI is slow: FastCGI to the rescue!
• Persistent processes but CGI mad style
• Biggest drawbacks?
  • “it’s old”
  • “I don’t like the protocol”
  • “it’s not maintained”
  • “other people say it’s not stable”
• Apache fcgi modules do kind of suck
• Nginx “just works”
• IIS8+ “just works”
php-fpm – Make FastCGI better
• FastCGI Process Manager
• Adds more features than traditional FastCGI
  • Better process management including graceful stop/start
  • Uid/gid/chroot/environment/port/ini configuration per worker
  • Better logging
  • Emergency restart
  • Accelerated upload support
  • Dynamic/static child spawning
CLI?
• Yes, in PHP the CLI is a SAPI
• (Did you know there’s a special windows cli that doesn’t pop a console window?)
• PHP “overloads” the CLI to have a command line webserver for easier development (even though it SHOULD be on its own)
• PHP did that because fighting with distros to always include the cli-server would have meant pain, and if you just grab php.exe the dev webserver is always available
• The CLI treats console STDIN/STDOUT as its request/response
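A minimal sketch of the "STDIN is the request, STDOUT is the response" idea. The uppercaseLines() function is a made-up handler; the STDIN and STDOUT constants it would be called with are predefined under the CLI SAPI.

```php
<?php
// Read a "request" line by line from one stream and write the "response" to another.
// uppercaseLines() is a hypothetical handler used only for illustration.
function uppercaseLines($in, $out): int
{
    $count = 0;
    while (($line = fgets($in)) !== false) {
        fwrite($out, strtoupper($line));
        $count++;
    }

    return $count; // number of lines handled
}

// Under the CLI SAPI the call would be:
//   uppercaseLines(STDIN, STDOUT);
// and from a shell:
//   echo "hello" | php upper.php
```

Because the handler only sees stream resources, the same function works on files, sockets, or in-memory streams.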
php-embed
• A thin wrapper allowing PHP to be easily embedded via C
• Used for extensions in node, python, ruby, and perl to interact with PHP
• Corresponding extensions do exist for those languages embedded in PHP
phpdbg
• Wait – there’s a debugger SAPI?
• Yes, yes there is
litespeed
• It is a SAPI
• The server just went open source…
• I’ve never tried it, but they take care of the SAPI
Just connect to the app?
• Use a webserver to reverse proxy to webserver built into a framework?
• Smart to use a webserver that has already solved the hard stuff
• But the app/web framework on top needs to deal with
  • HTTP keepalive?
  • Gzip with caching?
  • X-forwarded-for? Logging? Issues
  • Load balancing and failover?
  • HTTPS and caching?
  • ulimit? Remember we’re opening up a bunch of sockets!
Well, PHP streams can do that
Streams
Input and Output beyond the SAPI
What is a Stream?
• Access input and output generically
• Can write and read linearly
• May or may not be seekable
• Comes in chunks of data
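Those four properties can be seen with an in-memory stream. This is a sketch: readInChunks() is an illustrative helper, and php://memory is used only so the example needs no files or network.

```php
<?php
// Read any stream resource in fixed-size chunks, whatever sits behind it.
// readInChunks() is an illustrative helper, not a PHP built-in.
function readInChunks($stream, int $chunkSize): array
{
    $chunks = [];
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // nothing more to read
        }
        $chunks[] = $chunk;
    }

    return $chunks;
}

$stream = fopen('php://memory', 'w+');   // generic access: same calls as for a file
fwrite($stream, 'hello streams');        // linear write
$meta = stream_get_meta_data($stream);
var_dump($meta['seekable']);             // not every stream can seek; this one reports it
rewind($stream);
print_r(readInChunks($stream, 5));       // data comes back in chunks
fclose($stream);
```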
How PHP Streams Work
Stream Contexts
Stream Wrapper
Stream Filter
ALL IO
Definitions
• Socket
  • Bidirectional network stream that speaks a protocol
• Transport
  • Tells a network stream how to communicate
• Wrapper
  • Tells a stream how to handle specific protocols and encodings
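A small sketch of how wrapper, filter, and context fit together, using only built-in functions. The upperThroughFilter() helper is illustrative, and the http context options are example values that are never used for a real request here.

```php
<?php
// Push data through a write filter attached to a stream and read back the result.
// upperThroughFilter() is an illustrative helper.
function upperThroughFilter(string $data): string
{
    // Wrapper: the scheme before "://" picks how the stream is opened.
    $stream = fopen('php://memory', 'w+');

    // Filter: transforms data as it passes through the stream.
    stream_filter_append($stream, 'string.toupper', STREAM_FILTER_WRITE);

    fwrite($stream, $data);
    rewind($stream);
    $result = stream_get_contents($stream);
    fclose($stream);

    return $result;
}

echo upperThroughFilter("filtered on write\n");

// Context: a set of options handed to a wrapper when a stream is opened.
$context = stream_context_create(['http' => ['method' => 'GET', 'timeout' => 2.0]]);
print_r(stream_context_get_options($context));

// What this PHP build has registered.
print_r(stream_get_wrappers());
print_r(stream_get_filters());
```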
Built in Socket Transports
• tcp
• udp
• unix
• udg
• SSL extension
  • ssl
  • sslv2
  • sslv3
  • tls
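The list of transports depends on the platform and on which extensions are loaded. A quick way to check a given build, using the built-in stream_get_transports(); hasTransport() is an illustrative helper:

```php
<?php
// Check whether this PHP build has a given socket transport registered.
// hasTransport() is an illustrative helper around stream_get_transports().
function hasTransport(string $name): bool
{
    return in_array($name, stream_get_transports(), true);
}

print_r(stream_get_transports());
var_dump(hasTransport('tcp'));
var_dump(hasTransport('tls')); // typically true only when the openssl extension is loaded
```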
You can write your own streams!
• You can do a stream wrapper in userland and register it
• But you need an extension to register them if they have a transport
• Extensions with streams include ssh, bzip2, openssl
• I’d really like the curl stream back (not with the compile flag, but curl://)
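A minimal sketch of a userland stream wrapper. The hello:// scheme and the HelloStream class are made up for illustration; stream_wrapper_register() and the stream_* method names are the hooks PHP calls on a registered wrapper class.

```php
<?php
// Hypothetical read-only wrapper: reading "hello://world" yields "Hello, world!".
class HelloStream
{
    /** @var resource|null Set by PHP when a context is passed to fopen(). */
    public $context;

    private $data = '';
    private $position = 0;

    // Called on fopen()/file_get_contents(); the part after "hello://" is the name.
    public function stream_open(string $path, string $mode, int $options, ?string &$opened_path): bool
    {
        $name = (string) parse_url($path, PHP_URL_HOST);
        $this->data = "Hello, {$name}!";
        $this->position = 0;

        return true;
    }

    // Hand back at most $count bytes and advance the read position.
    public function stream_read(int $count): string
    {
        $chunk = (string) substr($this->data, $this->position, $count);
        $this->position += strlen($chunk);

        return $chunk;
    }

    public function stream_eof(): bool
    {
        return $this->position >= strlen($this->data);
    }

    public function stream_stat(): array
    {
        return ['size' => strlen($this->data)];
    }
}

stream_wrapper_register('hello', HelloStream::class);

echo file_get_contents('hello://world'), PHP_EOL;
```

Once registered, the scheme works with the ordinary file functions (fopen, fread, file_get_contents) for the rest of the request.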
Welcome to the Engine
Lexers and Parsers and Opcodes OH MY!
Lexer
• checks PHP’s spelling
• turns into tokens
• see token_get_all for what PHP sees
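To see what the lexer sees, token_get_all() can be run on a snippet. This sketch assumes the tokenizer extension is available (it is enabled by default); tokenNames() is an illustrative helper.

```php
<?php
// List the tokens the lexer produces for a piece of source code.
// tokenNames() is an illustrative helper around token_get_all().
function tokenNames(string $code): array
{
    $names = [];
    foreach (token_get_all($code) as $token) {
        // Named tokens are arrays [id, text, line]; single characters are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }

    return $names;
}

print_r(tokenNames('<?php echo 1 + 2;'));
```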
Parser + AST
• checks PHP’s grammar
• E_PARSE means “bad phpish”
• creates AST
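A sketch of a grammar check that never runs the code. It assumes PHP 7 or later, where a parse failure surfaces as a ParseError exception, and the tokenizer extension with its TOKEN_PARSE flag; parses() is an illustrative helper.

```php
<?php
// Ask the parser whether a snippet is grammatical, without executing it.
// parses() is an illustrative helper; TOKEN_PARSE makes token_get_all() run the parser too.
function parses(string $code): bool
{
    try {
        token_get_all($code, TOKEN_PARSE);
        return true;
    } catch (ParseError $e) {
        return false; // "bad phpish"
    }
}

var_dump(parses('<?php echo 1 + 2;'));
var_dump(parses('<?php if ('));
```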
Compiler
• Turns AST into Opcodes
• Allows for fancier grammar
• Opcodes can then be cached (opcache) skipping lex/parse/compile cycle
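Whether the opcode cache is actually in play can be checked at runtime. A hedged sketch: opcacheSummary() is an illustrative helper, it assumes the opcache extension's opcache_get_status() function, and it degrades to a message when opcache is missing or disabled (the CLI often has it switched off).

```php
<?php
// Summarise the opcode cache state for the current SAPI.
// opcacheSummary() is an illustrative helper around opcache_get_status().
function opcacheSummary(): string
{
    if (!function_exists('opcache_get_status')) {
        return 'opcache extension not loaded';
    }

    $status = @opcache_get_status(false); // false: skip the per-script details
    if (!is_array($status) || empty($status['opcache_enabled'])) {
        return 'opcache loaded but not enabled for this SAPI';
    }

    $stats = $status['opcache_statistics'] ?? [];

    return sprintf(
        '%d cached scripts, %d hits, %d misses',
        $stats['num_cached_scripts'] ?? 0,
        $stats['hits'] ?? 0,
        $stats['misses'] ?? 0
    );
}

echo opcacheSummary(), PHP_EOL;
```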
Opcodes
• dump with http://derickrethans.nl/projects.html
• machine readable language which the runtime understands
Engine (Virtual Machine)
• reads opcode
• does something
• zend extension can hook it!
• ???
• PROFIT
Extensions
How a simple design pattern made PHP more useful
“When I say that PHP is a ball of nails, basically, PHP is just this piece of shit that you just put all the parts together and you throw it against the wall and it fucking sticks”- Terry Chay
So what is an extension?
• Written in C or C++
• Compiled statically into the PHP binary or as a shared object (so/dylib/dll)
• Provides
  • Bindings to a C or C++ library
    • even embed other languages
  • Code in C instead of PHP (speed)
    • template engine
  • Alter engine functionality
    • debugging
So why an extension?
• add functionality from other languages (mainly C)
• speed
• to infinity and beyond!
  • intercept the engine
  • add debugging
  • add threading capability
  • the impossible (see: operator)
About Extensions
• Types
  • Zend Extension
  • PHP Module
• Sources
  • Core Built in
  • Core Default
  • Core
  • PECL
  • Github and Other 3rd Party
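The two extension types can be listed from userland with the built-in get_loaded_extensions(); passing true returns only Zend extensions. The extensionReport() helper is illustrative.

```php
<?php
// Separate the loaded extensions into PHP modules and Zend extensions.
// extensionReport() is an illustrative helper around get_loaded_extensions().
function extensionReport(): array
{
    return [
        'modules'         => get_loaded_extensions(),     // PHP modules
        'zend_extensions' => get_loaded_extensions(true), // Zend extensions only
    ];
}

$report = extensionReport();
printf(
    "%d modules, %d Zend extensions\n",
    count($report['modules']),
    count($report['zend_extensions'])
);
var_dump(extension_loaded('standard')); // check for a single module by name
```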
– “We need to foster a greater sense of community for people writing PHP extensions, […] Quite what this means hasn't been decided, although one of the major responsibilities is to spark up some community spirit, and that is the purpose of this email.”
- Wez Furlong, 2003
What is PECL?
• PHP Extension Community Library
• The place for people to find PHP extensions
• No GPL code – license should be PHP license compatible (LGPL is ok but not encouraged)
• http://news.php.net/article.php?group=php.pecl.dev&article=5
PECL Advantages
• Code reviews
  • See https://wiki.php.net/internals/review_comments
• Help from other devs with internal API changes (if in PHP source control)
  • https://svn.php.net/viewvc?view=revision&revision=297236
• Advertising and individual release cycles
  • http://pecl.php.net/news/
• pecl command line integration
  • actually just integration with PEAR installer (which supports binaries/compiling) and unique pecl channel
• php.net documentation!
PECL Problems
• Has less oversight into code quality
  • peclqa?
  • not all source accessible
• no action taken for abandoned code
  • still has “siberia” modules mixed with “need a maintainer”
• never enough help
  • tests
  • bug triaging
  • maintainers
  • code reviews
  • docs!
• no composer integration
• Half the code in git, half in svn still, half… elsewhere …
“It’s really free as in pull request”- me
My extension didn’t make it faster!
• PHP is usually not the real bottleneck
• Do full stack profiling and benchmarking to see if PHP is the real bottleneck
• If PHP IS the real bottleneck you’re awesome – and you need to be writing stuff in C or C++
• Most times your bottleneck is not PHP but I/O
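Before reaching for C, time the pieces. A rough sketch only: medianSeconds() is an illustrative helper, the two workloads are stand-ins for real application code, and a proper profiler gives a fuller picture than wall-clock timing.

```php
<?php
// Median wall-clock seconds for a callable over several runs.
// medianSeconds() is an illustrative helper, not a replacement for a real profiler.
function medianSeconds(callable $work, int $runs = 5): float
{
    $samples = [];
    for ($i = 0; $i < max(1, $runs); $i++) {
        $start = microtime(true);
        $work();
        $samples[] = microtime(true) - $start;
    }
    sort($samples);

    return $samples[intdiv(count($samples), 2)];
}

// Stand-in workloads (hypothetical): pure computation vs. writing to a temp file.
$cpu = function () {
    $hash = '';
    for ($i = 0; $i < 20000; $i++) {
        $hash = md5($hash . $i);
    }
};
$io = function () {
    $file = tmpfile();
    if ($file === false) {
        return; // no writable temp directory; skip the I/O workload
    }
    for ($i = 0; $i < 2000; $i++) {
        fwrite($file, str_repeat('x', 1024));
    }
    fflush($file);
    fclose($file);
};

printf("cpu: %.4fs  io: %.4fs\n", medianSeconds($cpu), medianSeconds($io));
```

Swapping the stand-ins for a database call, an HTTP request, and the PHP code in between shows where the time really goes.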
What about other languages?
• Ruby gem
  • Will compile and install
• Node’s npm
  • Will compile and install
• Perl’s CPAN
  • Written in special “xs” language
  • Will compile and install
• Python
  • Mixed bag? Distutils can install or grab a binary
FFI
Talk C without compiling
What is FFI?
• Foreign Function Interface
• Most things written in C use libffi
• https://github.com/libffi/libffi
Who has FFI?
• Java calls it JNI
• HHVM calls it HNI
• Python calls it “ctypes” (do not ask, stupidest name ever)
• C# calls it P/Invoke
• Ruby calls it FFI
• Perl has Inline::C (a bit of a mess)
• PHP calls it…
FFI
Oh wait…
• PHP’s FFI is rather broken
• PHP’s FFI has no maintainer
• It needs some TLC
• There’s MFFI but it’s not done
  • https://github.com/mgdm/MFFI
• Are you interested and not afraid?
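For a feel of what "talk C without compiling" looks like from PHP, here is a hedged sketch. It assumes the FFI extension bundled with PHP 7.4 and later, which is newer than these slides and is neither the PECL ffi nor MFFI discussed above. The library name libc.so.6 is a glibc/Linux assumption, and ffiAbs() is an illustrative helper that returns null when FFI is unavailable.

```php
<?php
// Call C's abs() through FFI without writing or compiling an extension.
// Assumptions: ext/ffi (PHP 7.4+) is loaded and enabled, and libc is "libc.so.6".
function ffiAbs(int $n): ?int
{
    if (!class_exists('FFI')) {
        return null; // FFI extension not loaded
    }

    try {
        // Declare the C signature, then bind it to the shared library.
        $libc = FFI::cdef('int abs(int x);', 'libc.so.6');

        return $libc->abs($n);
    } catch (Throwable $e) {
        return null; // library not found, or FFI disabled by configuration
    }
}

var_dump(ffiAbs(-5));
```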
For the future?
• More SAPIs?
  • Websockets
  • PSR-7
  • Other ideas?
• Fix server-tests.php so we can test SAPIs
  • Only CGI and CLI are currently tested well
• More extensions
  • Guidelines for extensions
  • Better documentation
  • Builds + pickle + composer integration
About Me
• http://emsmith.net
• twitter – @auroraeosrose
• IRC – freenode – auroraeosrose #phpmentoring
• https://joind.in/talk/67433