Juglouvain http revisited

HTTP revisited & some Java networking. Java User Group Louvain-La-Neuve @ EPHEC, 20/11/2014. Marc Tritschler. 24/10/2014 Copyright (c) Marc Tritschler

Transcript of Juglouvain http revisited

Page 1: Juglouvain http revisited

HTTP revisited & some Java networking

Java User Group Louvain-La-Neuve @ EPHEC

20/11/2014

Marc Tritschler


Page 2: Juglouvain http revisited

Program

1. Introduction

2. Internet Stack (reminder?)

3. Java and the Internet stack

4. Coding time

PLEASE PLEASE INTERRUPT ME (IRQ-0 or any other)

Page 3: Juglouvain http revisited

1. Introduction

Already heard of Gopher ?

Internet = HTTP


Page 4: Juglouvain http revisited

Internet = HTTP

• Google
• Facebook
• Gmail
• Yahoo
• Youtube
• Twitter
• Amazon
• …

HTTP

Page 5: Juglouvain http revisited

Almost EVERYTHING runs over HTTP

• HTTP ~ 75 % of traffic (http://www.caida.org/publications/papers/1998/Inet98/Inet98.html MUST READ)
– WebServices (SOAP & REST)
– HTML
– AJAX
– Email (webmail)

• Exceptions
– Email (smtp/imap/pop3)
– DNS
– FTP
– WebSocket, which 'upgrades' from HTTP (previous JUG)

Page 6: Juglouvain http revisited

HTML, JS, GIF, MP4 … over HTTP


Page 7: Juglouvain http revisited

2. The Internet Stack

Forget about the 7-layer OSI model

Page 8: Juglouvain http revisited

The Internet Stack (4 layers)

[Diagram: the four-layer Internet stack, top to bottom]

• My App
• HTTP (port 80) and SSL (port 443): in the JRE, Java
• TCP/IP family: part of the OS, C/C++
• Physical Layer: electronics, assembly

Side axis: number of job & product opportunities

Page 9: Juglouvain http revisited

Where's HTML in this Stack ???

DO NOT MIX DATA, API and PROTOCOL

• Data (= contents = payload = BYTES)
– Binary vs Text
– HTML, CSS, XML, JavaScript, JPEG, MP4, …
– Text Data Encodings (UTF-8)
• API: vertical links (no bytes on the wire)
• Protocol: horizontal links
• AJAX = JavaScript performing HTTP requests
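The "payload = BYTES" point can be illustrated with a minimal sketch. The class name `PayloadBytes` and the sample text are assumptions made for this illustration, not part of the talk. It shows that text only becomes payload once an encoding such as UTF-8 turns characters into bytes, and that the character count and the byte count can differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration: text becomes payload only once it is encoded.
class PayloadBytes {
    // Encode text into the bytes that would travel as payload.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode received bytes back into text, using the same charset.
    static String decode(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u00e9t\u00e9 2014"; // "été 2014": two accented characters
        byte[] payload = encode(text);
        // Each accented character takes two bytes in UTF-8, so the counts differ.
        System.out.println(text.length() + " chars, " + payload.length + " bytes");
        System.out.println("round trip ok: " + decode(payload).equals(text));
    }
}
```

Sender and receiver must agree on the encoding, which is what the `charset` parameter of the Content-Type header is for.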


Page 10: Juglouvain http revisited

TCP ports
http://fr.wikipedia.org/wiki/Liste_de_ports_logiciels

Well Known (0 – 1023)
20, 21 FTP
22 SSH
23 Telnet
25, 110 SMTP/POP3
53 DNS
80 HTTP
137 … 139 NETBIOS
389 LDAP
443 HTTPS

Others (1024-65535)
1521 Oracle DB
8080 http proxies, Tomcat
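As a small illustration of how these port numbers surface in Java, the sketch below asks `java.net.URL` for the default port of a scheme. The class name `DefaultPorts` and the example URLs are assumptions made for this illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper for illustration: which TCP port does a URL end up using?
class DefaultPorts {
    private static URL parse(String url) {
        try {
            return URI.create(url).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + url, e);
        }
    }

    // Port implied by the scheme alone (80 for http, 443 for https).
    static int defaultPort(String url) {
        return parse(url).getDefaultPort();
    }

    // Explicit port if the URL has one, otherwise the scheme default.
    static int effectivePort(String url) {
        URL parsed = parse(url);
        return parsed.getPort() != -1 ? parsed.getPort() : parsed.getDefaultPort();
    }

    public static void main(String[] args) {
        System.out.println(defaultPort("http://example.org/"));
        System.out.println(defaultPort("https://example.org/"));
        System.out.println(effectivePort("http://localhost:8080/"));
    }
}
```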


Page 11: Juglouvain http revisited

HTTP versions

• HTTP 1.0 @DEPRECATED
– each request/response uses a new TCP connection (= exchange of 3 TCP packets (SYN, SYN/ACK, ACK))

• HTTP 1.1 CURRENT
– Keeps the TCP session open

• HTTP 2.0 FUTURE (around DEC 2014)
– Negotiation (1.1, 2.0, other protocols)
– Close to 1.1 (methods, status codes, …)
– Server Push
– Fixes the HOL (head-of-line blocking) problem
– Loads page elements in parallel over a single TCP connection

http://en.wikipedia.org/wiki/HTTP/2 for more info

Page 12: Juglouvain http revisited

HTTP Refresher

• RFC/IETF Standards (read this only if …)
• Simple request/response
• Header + [Body]
• Stateless
• Bytes and Chars (use UTF-8 encoding)
• Synchronous HALF-DUPLEX (request ALWAYS initiated by the client; remember the problems for interactive games)
• Can be verbose (http headers) (~600 bytes for simple Hello World)
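The "Header + [Body]" structure can be sketched in a few lines. The class `HttpMessageSketch`, its method names and the sample host are assumptions made for this illustration. It builds a minimal HTTP/1.1 GET request as text and splits a message at the empty line that separates headers from body.

```java
// Hypothetical helper for illustration: an HTTP message is header lines, an empty line, then an optional body.
class HttpMessageSketch {
    static final String CRLF = "\r\n";

    // Build a minimal HTTP/1.1 GET request; the Host header is mandatory in 1.1.
    static String buildGet(String host, String path) {
        return "GET " + path + " HTTP/1.1" + CRLF
             + "Host: " + host + CRLF
             + "Connection: close" + CRLF
             + CRLF; // empty line: end of headers, no body for GET
    }

    // Split a message into { header block, body } at the first empty line.
    static String[] splitHeaderAndBody(String message) {
        int separator = message.indexOf(CRLF + CRLF);
        if (separator < 0) {
            return new String[] { message, "" };
        }
        return new String[] {
            message.substring(0, separator),
            message.substring(separator + 2 * CRLF.length())
        };
    }

    public static void main(String[] args) {
        System.out.print(buildGet("example.org", "/"));
        String response = "HTTP/1.1 200 OK" + CRLF + "Content-Length: 5" + CRLF + CRLF + "Hello";
        String[] parts = splitHeaderAndBody(response);
        System.out.println("headers: " + parts[0].length() + " chars, body: " + parts[1]);
    }
}
```

Even this toy response spends far more characters on headers than on the five-character body, which is the verbosity the slide refers to.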


Page 13: Juglouvain http revisited

HTTP Overview


REQUEST (GET, POST, …)

RESPONSE (CODE + [DATA])
1xx : Informational - Request received, continuing process
2xx : Success - The action was successfully received, understood, and accepted
3xx : Redirection - Further action must be taken in order to complete the request
4xx : Client Error - The request contains bad syntax or cannot be fulfilled
5xx : Server Error - The server failed to fulfill an apparently valid request
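The five classes listed above depend only on the first digit of the status code. A minimal sketch, with the class name `StatusClass` assumed for this illustration:

```java
// Hypothetical helper for illustration: map a status code to its class by its first digit.
class StatusClass {
    static String describe(int code) {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default:
                throw new IllegalArgumentException("Not an HTTP status code: " + code);
        }
    }

    public static void main(String[] args) {
        int[] samples = { 200, 301, 404, 407, 500 };
        for (int code : samples) {
            System.out.println(code + " : " + describe(code));
        }
    }
}
```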

Client Server

Page 14: Juglouvain http revisited

HTTP Request : methods
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

• Safe (GET/HEAD) & Idempotent methods
• GET, HEAD
• OPTIONS
• POST, PUT
• DELETE
• TRACE
• CONNECT FREEDOM
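The safe/idempotent distinction can be written down as a small table in code. The enum below is a sketch for this illustration, not an existing API. It follows RFC 2616 section 9.1 as linked above: GET and HEAD are safe; GET, HEAD, PUT, DELETE, OPTIONS and TRACE are idempotent; POST and CONNECT are neither. The later RFC 7231 additionally classifies OPTIONS and TRACE as safe.

```java
// Hypothetical enum for illustration, following the classification in RFC 2616 section 9.1.
enum HttpMethod {
    GET(true, true),
    HEAD(true, true),
    OPTIONS(false, true),
    POST(false, false),
    PUT(false, true),
    DELETE(false, true),
    TRACE(false, true),
    CONNECT(false, false);

    private final boolean safe;
    private final boolean idempotent;

    HttpMethod(boolean safe, boolean idempotent) {
        this.safe = safe;
        this.idempotent = idempotent;
    }

    // Safe: the request is meant for retrieval only, not for triggering an action.
    boolean isSafe() { return safe; }

    // Idempotent: sending the same request N times has the same effect as sending it once.
    boolean isIdempotent() { return idempotent; }

    public static void main(String[] args) {
        for (HttpMethod method : values()) {
            System.out.println(method + " safe=" + method.isSafe()
                    + " idempotent=" + method.isIdempotent());
        }
    }
}
```

In practice this matters for retries: a client can repeat an idempotent request after a network failure, whereas repeating a POST may duplicate its effect.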


Page 15: Juglouvain http revisited

HTTP Responses : Status Codes

• 200 OK
• 400 Bad Request
• 401 Unauthorized (WWW-Authenticate header)
• 403 Forbidden
• 404 Not Found
• 407 Proxy Authentication Required (Proxy-Authenticate header)
• 500 Internal Server Error

• Complete List: http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html#sec6.1.1

Page 16: Juglouvain http revisited

HTTP Headers
http://en.wikipedia.org/wiki/List_of_HTTP_header_fields

• A lot of "standard" and "non-standard" headers are defined … a little bit messy

• Firefox Dev console

Page 17: Juglouvain http revisited

HTTP Request Example

POST http://sghrsot.cc.cec.eu.int:1045/hermes/Proxy/1.17/DocumentWebServicePS HTTP/1.1
Accept-Encoding: gzip,deflate
Content-Type: text/xml;charset=UTF-8
SOAPAction: ""
User-Agent: Jakarta Commons-HttpClient/3.1
Host: host1.domain1.company:1045
Content-Length: 585

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:typ="http://xx.xxxxxx.eu/sg/hrs/types">
  <soapenv:Header/>
  <soapenv:Body>
    <typ:getDocument>
      <typ:header>
        <typ:userName>xyz</typ:userName>
        <typ:ticket>onetimeticket</typ:ticket>
        <typ:applicationId>myapp</typ:applicationId>
      </typ:header>
      <typ:documentId>080166e48102103b</typ:documentId>
    </typ:getDocument>
  </soapenv:Body>
</soapenv:Envelope>

Page 18: Juglouvain http revisited

HTTP Response Example

HTTP/1.1 200 OK
Date: Mon, 20 Oct 2014 16:12:22 GMT
Content-Length: 9159
Content-Type: text/xml; charset=utf-8

<?xml version="1.0" encoding="UTF-8"?>
<S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
  <S:Body>
    <typ:getDocumentResponse xmlns:typ="http://xx.xxxxxx.eu/sg/hrs/types">
      <typ:document>
        <typ:documentId>080166e48102103b</typ:documentId>
        … (stripped)
    </typ:getDocumentResponse>
  </S:Body>
</S:Envelope>

Page 19: Juglouvain http revisited

Quiz Time

Guess the number of HTTP requests per web site …

Page 20: Juglouvain http revisited

Example 1 : www.lesoir.be


HOW Many HTTP requests ?

Page 21: Juglouvain http revisited

Example 2: mon-programmetv.be


HOW Many HTTP requests ?

Page 22: Juglouvain http revisited

Example 3: www.google.be


HOW Many HTTP requests ?

Page 23: Juglouvain http revisited

Number of HTTP requests per single web site visited

1. 200 requests/responses for www.lesoir.be OMG !!!

2. It's full of advertisements (visible) and invisible personal tracking systems (cookies, javascript, re-directs, …)

3. JS is evil

Conclusion : YOU ARE NOT ANONYMOUS

Page 24: Juglouvain http revisited

How does your browser get its proxy?

• Web Proxy Autodiscovery Protocol

Page 25: Juglouvain http revisited

HTTP Advanced

• Authentication
• HTTP Proxies
• HTTP Tunnelling
• HTTP Pipelining
• HTTPS

Page 26: Juglouvain http revisited

HTTP Authentication (RFCs 2616, 2617, 7235)

Basic: The client sends the user name and password as unencrypted base64 encoded text. It should only be used with HTTPS, as the password can be easily captured and reused over HTTP.

Digest: The client sends a hashed form of the password to the server. Although the password cannot be captured over HTTP, it may be possible to replay requests using the hashed password.

NTLM (Windows): This uses a secure challenge/response mechanism that prevents password capture or replay attacks over HTTP.

Page 27: Juglouvain http revisited

HTTP Authentication: 401 – Access Denied

Client (browser) → Server:
GET /securefiles/ HTTP/1.1

Server → Client (browser):
HTTP/1.1 401 Access Denied
WWW-Authenticate: Basic realm="My Server"
Content-Length: 0

User types his/her password

Client (browser) → Server:
GET /securefiles/ HTTP/1.1
Host: www.httpwatch.com
Authorization: Basic aHR0cHdhdGNoOmY=
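The Authorization value in the exchange above is simply `user:password` encoded in base64, which is why Basic authentication offers no secrecy on its own. A minimal sketch using `java.util.Base64` (Java 8); the class name `BasicAuthHeader` is an assumption made for this illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration: Basic credentials are base64, not encryption.
class BasicAuthHeader {
    // Build the value of an Authorization header for Basic authentication.
    static String encode(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Recover "user:password" from a header value, as anyone sniffing plain HTTP could.
    static String decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Decoding the header value shown on the slide reveals the credentials.
        System.out.println(decode("Basic aHR0cHdhdGNoOmY="));
        System.out.println(encode("httpwatch", "f"));
    }
}
```

Decoding the slide's header yields `httpwatch:f`, which shows why Basic should only travel over HTTPS.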

Page 28: Juglouvain http revisited

HTTP Authentication: 407 – Proxy Authentication Required

• Same as 401, except that the proxy MUST return a Proxy-Authenticate header

• Browser asks user to type his/her password

Page 29: Juglouvain http revisited

HTTP Proxy/Reverse Proxy

• Proxy: local net → internet
• Reverse Proxy: internet → local net

[Diagram: a direct connection, where the client talks HTTP straight to the server, compared with a proxied connection, where Client, Proxy and Server are linked by HTTP]

Page 30: Juglouvain http revisited

HTTP Tunnelling

[Diagram: Client, Proxy, Server. The client sends an HTTP CONNECT to the proxy, which then relays raw TCP to the server (port forwarding)]
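The first step of tunnelling is an ordinary HTTP message. The sketch below builds the CONNECT request a client would send to the proxy and checks the proxy's status line. The class name `ConnectRequest` and the host names are assumptions made for this illustration, and no network traffic is involved.

```java
// Hypothetical helper for illustration: the text exchanged before a proxy starts relaying bytes.
class ConnectRequest {
    // CONNECT names the target as host:port instead of a path.
    static String build(String targetHost, int targetPort) {
        String authority = targetHost + ":" + targetPort;
        return "CONNECT " + authority + " HTTP/1.1\r\n"
             + "Host: " + authority + "\r\n"
             + "\r\n";
    }

    // A 2xx status line means the proxy has switched to blind TCP forwarding.
    static boolean tunnelEstablished(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return parts.length >= 2
                && parts[1].length() == 3
                && parts[1].charAt(0) == '2';
    }

    public static void main(String[] args) {
        System.out.print(build("example.org", 443));
        System.out.println(tunnelEstablished("HTTP/1.1 200 Connection established"));
        System.out.println(tunnelEstablished("HTTP/1.1 407 Proxy Authentication Required"));
    }
}
```

After a 2xx answer, everything the client writes on that connection is forwarded untouched, which is how HTTPS passes through an HTTP proxy.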

Page 31: Juglouvain http revisited

HTTP Pipelining
http://en.wikipedia.org/wiki/HTTP_pipelining

Page 32: Juglouvain http revisited

HTTPS

• HTTP over SSL
• Secure Browsing ?
– HeartBleed
– SSL3.0 recently found weak
– TLS 1.0 min
– Root certificate

Page 33: Juglouvain http revisited

3. Java & The Internet Stack


?

Page 34: Juglouvain http revisited

Java and Internet

• Java is (my favorite) language to work at the application layer, up to TCP/IP … (wait for the next slide)

• Java has no access to protocols below IP (that needs calls to native libs, outside the HTTP scope)

• Don't underestimate the complexity of SSL interactions, even in Java !!!

Page 35: Juglouvain http revisited

Java and the Internet Stack

[Diagram: the Internet stack as seen from Java, top to bottom]

• My Application / My App (ONLY FOCUS ON YOUR BUSINESS), Web Browser
• Web Services, WebSocket, JavaMail (javax.mail)
• HTTP (80/443), SMTP/POP3 (25, 110), FTP, DNS (53)
• Socket API (java.net) or JSSE (javax.net.ssl), in the JRE
• TCP/UDP, IPv4 and IPv6 (Linux): implemented in the OS. Java has limited access via API
• ICMP, ARP, DHCP, …, Physical Layer: implemented in OS or hardware. No 'direct' access

Legend: Available in Java SE / Open Source or future

Page 36: Juglouvain http revisited

API vs Protocol

• API: vertical links (no bytes on the wire)
• Protocol: horizontal links

Page 37: Juglouvain http revisited

Socket API (java.net)

• Most important (access
• Server Sockets
• Client sockets
• Base for YOUR protocol !
• Base for HTTP, SMTP, …

Page 38: Juglouvain http revisited

Socket API - Main Classes

• Socket & ServerSocket
https://docs.oracle.com/javase/7/docs/api/java/net/Socket.html (Java 7)
https://docs.oracle.com/javase/8/docs/api/java/net/Socket.html (Java 8 :-))
• URL
• URLConnection
• HttpURLConnection
• …
• java.net package http://docs.oracle.com/javase/8/docs/api/java/net/package-summary.html
• Stack properties http://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html

Page 39: Juglouvain http revisited

SMTP/POP3 (JavaMail)

https://javamail.java.net/nonav/docs/api/com/sun/mail/smtp/package-summary.html

Page 40: Juglouvain http revisited

SSL/TLS (java.net → javax.net.ssl)

• Socket API (java.net) → JSSE (javax.net.ssl)

• Sockets
– (Client) Socket → SSLSocket
– ServerSocket → SSLServerSocket

• HttpURLConnection → HttpsURLConnection
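The mapping above can be observed without opening any connection. The sketch below is an illustration under assumptions (the class name `JsseSketch` and the example URL are not from the talk). It lists the protocol versions the default JSSE context enables, obtains the factory that produces `SSLSocket` instances, and checks that opening an `https:` URL yields an `HttpsURLConnection`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper for illustration: where the TLS counterparts of java.net classes come from.
class JsseSketch {
    // Protocol versions the default context enables for new connections.
    static List<String> defaultProtocols() {
        try {
            SSLContext context = SSLContext.getDefault();
            return Arrays.asList(context.getDefaultSSLParameters().getProtocols());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // The factory replaces "new Socket(host, port)" when an SSLSocket is wanted.
    static SSLSocketFactory socketFactory() {
        return (SSLSocketFactory) SSLSocketFactory.getDefault();
    }

    // openConnection() only creates the object; nothing is sent until it is used.
    static boolean opensHttpsConnection(String url) {
        try {
            URLConnection connection = URI.create(url).toURL().openConnection();
            return connection instanceof HttpsURLConnection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Enabled protocols: " + defaultProtocols());
        System.out.println("Cipher suites: " + socketFactory().getDefaultCipherSuites().length);
        System.out.println(opensHttpsConnection("https://example.org/"));
    }
}
```

The list of enabled protocols depends on the JDK version and its security configuration, so the output varies from machine to machine.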


Page 41: Juglouvain http revisited

Others

• WebSocket
– http://www.websocket.org/

• Java Specifics
– RMI
– JMX

• Web Services
– SOAP → JAX-WS
– REST → JAX-RS

Page 42: Juglouvain http revisited

4. Code Time

WARNING

Several packages and many classes: the challenge is to use the right ones

Page 43: Juglouvain http revisited

Setup - Toolbox

• Developer
– Java JDK (of course)
– Editor (Eclipse, NetBeans, …)

• Client Side
– PuTTY
– Web Browser + DEV console ! (Chrome, IE, Firefox, …)
– soapUI (Web Services)

• Server Side
– Apache HTTP server (min)
– Apache Tomcat (recommended)
– Full JEE (GlassFish, WildFly, …)

• Cloud
– Red Hat OpenShift
– …

https://github.com/tritschler/LLN_JUG/tree/master/2014_11_20

Page 44: Juglouvain http revisited

Example 1 – Echo protocol (client Socket & ServerSocket)

• No HTTP, directly over TCP

https://docs.oracle.com/javase/tutorial/displayCode.html?code=https://docs.oracle.com/javase/tutorial/networking/sockets/examples/EchoServer.java

DON'T DO THIS IN REAL LIFE
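For readers without access to the linked tutorial, here is a minimal echo sketch in the same spirit. It is not the code shown in the talk. The class name `EchoSketch` is an assumption, and the server handles a single client on a loopback port chosen by the OS, which is exactly the kind of shortcut the warning above is about.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical single-client echo over plain TCP, for illustration only.
class EchoSketch {
    private static BufferedReader reader(Socket socket) throws IOException {
        return new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket socket) throws IOException {
        // autoFlush = true: println() pushes each line onto the wire immediately.
        return new PrintWriter(
                new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Server side: accept one client and send every received line straight back.
    static void serveOneClient(ServerSocket server) throws IOException {
        try (Socket client = server.accept();
             BufferedReader in = reader(client);
             PrintWriter out = writer(client)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        }
    }

    // Client side: start a throwaway server, send one line, return the echoed line.
    static String roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) { // port 0 = any free port
            Thread serverThread = new Thread(() -> {
                try {
                    serveOneClient(server);
                } catch (IOException e) {
                    // Connection dropped or server socket closed: nothing left to echo.
                }
            });
            serverThread.setDaemon(true);
            serverThread.start();

            try (Socket socket = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writer(socket);
                 BufferedReader in = reader(socket)) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello"));
    }
}
```

A real server would at least handle several clients concurrently, set timeouts and bound the length of a line.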


Page 45: Juglouvain http revisited

Example 1 - Echo

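[Diagram: two JVMs, Echo (Client) and Echo (Server), each on top of Socket API (java.net), TCP, IPv4 and the Physical Layer. Logical flow: "Hello" goes directly from client to server. Real data flow: "Hello" travels down the client's stack, crosses the network inside IP packets, and climbs back up the server's stack.]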

Page 46: Juglouvain http revisited

Example 2 – Basic Web Crawler (URL, HttpURLConnection)

• Example 1 : no proxy
• Example 2 : proxy + basic http authentication


DON'T DO THIS IN REAL LIFE
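The two variants can be sketched with `java.net` alone. This is not the code from the talk's repository: the class name `CrawlerSketch`, the timeouts and the host names are assumptions made for this illustration. Whether the credentials belong in `Authorization` (origin server) or `Proxy-Authorization` (proxy) depends on which party sent the challenge, so the header name is left as a parameter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration: prepare an HTTP GET with or without a proxy.
class CrawlerSketch {
    // Describe an HTTP proxy; the address is not resolved until a connection is made.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Create the connection object only; nothing is sent yet. Use Proxy.NO_PROXY for variant 1.
    static HttpURLConnection prepare(String url, Proxy proxy) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
            connection.setRequestMethod("GET");
            connection.setConnectTimeout(5000); // milliseconds, arbitrary choice
            connection.setReadTimeout(5000);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Add Basic credentials under "Authorization" or "Proxy-Authorization".
    static void addBasicAuth(HttpURLConnection connection, String headerName,
                             String user, String password) {
        String pair = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        connection.setRequestProperty(headerName, "Basic " + encoded);
    }

    // Send the request and return the status code; this is the step that uses the network.
    static int fetchStatus(HttpURLConnection connection) throws IOException {
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String url = args.length > 0 ? args[0] : "http://example.org/";
        System.out.println(fetchStatus(prepare(url, Proxy.NO_PROXY)));
    }
}
```

For the proxied variant, something like `prepare(url, httpProxy("proxy.example", 8080))` followed by `addBasicAuth(connection, "Proxy-Authorization", user, password)` would be the equivalent call sequence. `java.net.Authenticator` is the alternative when the JDK should answer challenges itself.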

Page 47: Juglouvain http revisited

Java HTTP Client App

[Diagram: My Application … (JVM) on top of HTTP, Socket API, TCP/UDP, IP and the Physical Layer, talking to ANY HTTP server (Apache, Nginx, Tomcat, JBoss, Microsoft IIS, …) implemented in any programming language (Java, PHP, C, …) on ANY OS (Linux, Windows, Mac OS, …)]

Page 48: Juglouvain http revisited

Example 3 – Servlet
No networking code on the Server Side

• Servlet = Java spec for writing the HTTP server side
• No networking code ! (thanks to your AS)
• web.xml + class extends HttpServlet

1. Browser – Servlet
2. Browser – HttpTrace – Servlet
3. HttpURLConnection (no proxy) – Servlet
4. HttpURLConnection – HttpTrace – Servlet

DON'T DO THIS IN REAL LIFE

Page 49: Juglouvain http revisited

Java HTTP Client App – Java Servlet

[Diagram: ANY HTTP client (Web Browser, …) on ANY OS (Linux, Windows, Mac OS, …) talking to ANY HTTP server + Servlet Container (Apache Tomcat) on ANY OS (Linux, Windows, Mac OS, …)]

Page 50: Juglouvain http revisited

Example 4 – HTTP proxy

• Start local Tomcat
• Start HttpTrace
• Start Browser and point to localhost
• Launch httpclient

Page 51: Juglouvain http revisited

Resources (on the web of course, over HTTP)