CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis
![Page 1: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/1.jpg)
CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis
Module Leader: Dr Gordon Russell
Lecturers: G. Russell, R. Ludwiniak
Aliases: CSN11122 (Distance Learning Version)
![Page 2: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/2.jpg)
This lecture
• Configuring Apache
• Log analysis
• Discussions
![Page 3: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/3.jpg)
Configuring Apache
![Page 4: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/4.jpg)
Apache
• Very well known and respected HTTP server.
• Used commercially.
• Freely available from http://www.apache.org
• Plenty of plugins.
• Relatively easy and flexible to configure.
• Fast and reliable.
![Page 5: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/5.jpg)
Server Architectures
• Most server designs use one of:
  – Threaded model
  – Forking model
  – Asynchronous architecture
• A threaded model needs special OS support to provide lightweight threads. Not used in the Apache forking model described here, for security and reliability reasons.
• Forking means that each new request which arrives is handled by a whole process. This is the Apache way.
• Asynchronous. Some web servers exist with this model, where one process handles everything with complex IO code. Good for fast processing of simple web pages.
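The forking idea can be sketched in a few lines of Python. This is only an illustration, assuming a Unix-like system where os.fork is available; the function names are made up for the example, and it forks one child per request, whereas Apache keeps a pool of children already waiting.

```python
import os


def handle_request(request: str) -> str:
    """Stand-in for real work; a real server would fetch a file from disk."""
    return f"response to {request}"


def serve_in_child(request: str) -> str:
    """Handle one request in a separate process and return its response."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, send the result back, then exit.
        try:
            os.close(read_end)
            os.write(write_end, handle_request(request).encode())
            os.close(write_end)
        finally:
            os._exit(0)
    # Parent process: read until the child closes its end of the pipe.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return b"".join(chunks).decode()
```

Because each request runs in its own process, a crash while handling one request does not take the parent down with it.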
![Page 6: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/6.jpg)
Apache Forking Model
[Figure: an http request arrives at the MUX, which passes it to an idle child (one of several child processes); the child gets the data from disk and sends the response.]
![Page 7: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/7.jpg)
Initial Settings
StartServers 8
MinSpareServers 5
MaxSpareServers 20
MaxClients 150
MaxRequestsPerChild 1000
• These options are important, but often the least likely to be changed from the defaults!
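One way to read these settings is as rules for sizing the pool of children. The Python sketch below is a simplified model, not Apache's actual algorithm (the real server adds children gradually). It assumes the usual meaning of each directive: spare servers are idle children, MaxClients caps the total, and MaxRequestsPerChild is the number of requests after which a child is replaced. All names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreforkSettings:
    # Values taken from the slide above.
    start_servers: int = 8
    min_spare_servers: int = 5
    max_spare_servers: int = 20
    max_clients: int = 150
    max_requests_per_child: int = 1000


def pool_adjustment(busy: int, idle: int, s: PreforkSettings) -> int:
    """Return how many children to add (positive) or retire (negative)."""
    if idle < s.min_spare_servers:
        wanted = s.min_spare_servers - idle
        room = s.max_clients - (busy + idle)  # never exceed MaxClients
        return max(0, min(wanted, room))
    if idle > s.max_spare_servers:
        return -(idle - s.max_spare_servers)
    return 0


def should_recycle(requests_served: int, s: PreforkSettings) -> bool:
    """A child that has served its quota of requests is replaced."""
    return requests_served >= s.max_requests_per_child
```

For example, with 10 busy children and only 2 idle, this model asks for 3 more so that 5 are spare.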
![Page 8: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/8.jpg)
Important Files
• /etc/init.d/httpd – the server control script
• /etc/httpd/conf/httpd.conf – the main configuration file.
• Remember that a changed configuration is only reread on a server reload or restart.
• Errors and other details are logged by default in /var/log/httpd/ as access_log, error_log, and suexec.log.
![Page 9: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/9.jpg)
Reload or Restart
• Reload is the best option to use.
• With a reload, Apache checks your configuration file and switches to it only if it contains no errors.
• If it has errors, it keeps using the old configuration.
• This allows you to reconfigure a server with no downtime.
• Restart shuts down then starts the server…
• Look in the error log for help (e.g. /var/log/httpd/error_log), or syslog (e.g. /var/log/messages).
• Remember to use the service command for this:
  – service httpd start|stop|reload|restart|status
• You can easily make errors in the config file. You can check for errors using
  – service httpd configtest
![Page 10: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/10.jpg)
Mimic a Browser
• To understand how a server is running, it is sometimes useful to type requests to it at the keyboard and see the results as text.
• Telnet can do this, so long as you have learned some basic HTTP commands.
• The two important ones are:
  – HEAD – Give information on a page.
  – GET – Give me the whole page.
![Page 11: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/11.jpg)
• In HTTP 1.1 we can use virtual hosts.
• This allows multiple hosts to share a single server.
• Each host has a different name.
• The name of the host you want to answer a query is given as part of a page request.
• This is only supported in HTTP 1.1 and beyond.
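The host name travels in the Host header of the request. As a rough Python sketch, the text typed into telnet can be built and sent like this. The function names are made up for the example, and the Connection: close header is an addition so that the server hangs up after replying.

```python
import socket


def build_request(method: str, path: str, host: str) -> bytes:
    """Build the text of an HTTP/1.1 request naming the wanted virtual host."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",       # tells the server which virtual host should answer
        "Connection: close",   # assumption: ask the server to close when done
        "",
        "",                    # the blank line ends the request headers
    ]
    return "\r\n".join(lines).encode("ascii")


def send_request(server: str, request: bytes, port: int = 80,
                 timeout: float = 10.0) -> bytes:
    """Send the request to the server and return everything it sends back."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Sending build_request("HEAD", "/", "db.grussell.org") to linuxzoo.net connects to the same machine but asks a different virtual host to answer.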
![Page 12: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/12.jpg)
$ telnet linuxzoo.net 80
HEAD / HTTP/1.1
Host: linuxzoo.net

HTTP/1.1 200 OK
Date: Mon, 01 Nov 2008 15:06:44 GMT
Server: Apache/2.0.46 (Red Hat)
Last-Modified: Fri, 29 Oct 2008 14:47:22 GMT
ETag: "4981dd-920-22ea7280"
Accept-Ranges: bytes
Content-Length: 2336
Content-Type: text/html; charset=UTF-8
![Page 13: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/13.jpg)
$ telnet linuxzoo.net 80
HEAD / HTTP/1.1
Host: db.grussell.org

HTTP/1.1 200 OK
Date: Mon, 01 Nov 2008 15:08:52 GMT
Server: Apache/2.0.46 (Red Hat)
Last-Modified: Thu, 21 Oct 2008 09:12:33 GMT
ETag: "3c8066-a37-86c9a240"
Accept-Ranges: bytes
Content-Length: 2615
Content-Type: text/html; charset=UTF-8
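A response like the ones above is a status line followed by "Name: value" header lines. A small Python sketch of picking it apart follows; the function name is made up, and SAMPLE is a shortened copy of the first response shown earlier.

```python
def parse_head_response(text: str):
    """Split a HEAD response into (status code, reason, headers dict)."""
    lines = text.strip().splitlines()
    # The status line looks like: HTTP/1.1 200 OK
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers


SAMPLE = """HTTP/1.1 200 OK
Date: Mon, 01 Nov 2008 15:06:44 GMT
Server: Apache/2.0.46 (Red Hat)
Content-Length: 2336
Content-Type: text/html; charset=UTF-8
"""

status, reason, headers = parse_head_response(SAMPLE)
```

Comparing the Content-Length of the two responses is a quick way to see that the two Host values were answered with different pages.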
![Page 14: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/14.jpg)
VirtualHosts
• The sharing of a single IP to provide multiple hostnames is well supported in Apache.
• The part of the conf file which handles this is called <VirtualHost>
• Each part holds a list of hostnames it can handle
• The first host found in the file is always considered the default, so if no VirtualHost section matches the requested name, the first block is used instead.
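The matching rule can be sketched in Python as below. This is a simplified model of name-based selection, not Apache's code. The grussell.org names and document root come from the example configuration in this lecture, while the first (default) block and its document root are hypothetical.

```python
def select_vhost(host_header, vhosts):
    """Return the block whose names include the requested host, else the first block."""
    # Ignore any :port suffix and compare case-insensitively.
    wanted = (host_header or "").split(":")[0].strip().lower()
    for vhost in vhosts:
        if wanted in vhost["names"]:
            return vhost
    return vhosts[0]  # the first block in the file acts as the default


VHOSTS = [
    # Hypothetical default block, listed first in the file.
    {"names": {"linuxzoo.net"}, "document_root": "/var/www/html"},
    # ServerName plus ServerAlias entries from the lecture example.
    {"names": {"grussell.org", "www.grussell.org", "grussell.org.uk"},
     "document_root": "/home/gordon/public_html"},
]
```

A request for a name that no block lists, or one with no Host header at all, falls through to the first block.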
![Page 15: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/15.jpg)
<VirtualHost>
    ServerAdmin [email protected]
    DocumentRoot /home/gordon/public_html
    ServerName grussell.org
    ServerAlias www.grussell.org grussell.org.uk
    ErrorLog logs/gr-error_log
    CustomLog logs/gr-access_log combined
</VirtualHost>
![Page 16: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/16.jpg)
public_html
• Where Apache runs on a server used by many different users, it would be useful for each user to be able to build their own web pages which the server could serve.
• But the VirtualHost configuration takes only a single document root, and each user has their own directory in /home.
• You could make the root /home
  – All of the files in /home would be accessible, not just web pages.
  – It’s a bit disgusting…
• Instead, Apache supports web pages appearing in a user’s home directory, under the subdirectory public_html.
![Page 17: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/17.jpg)
public_html access
• URLs of the form
  – http://linuxzoo.net/~gordon/file.html
• Refer to
  – /home/gordon/public_html/file.html
• This feature must first be switched on in httpd.conf.
• To activate it, find the line
  – UserDir disable
• Then either delete the line, or put “#” (the comment character) in front of it.
• Then find the following line and delete the ‘#’ character.
  – #UserDir public_html
• Remember to reload the server.
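The URL-to-file mapping can be written out as a short Python sketch. It assumes home directories live under /home, as in the example above. The function name is made up, and the path clean-up step is an illustrative addition, not a description of how Apache does it.

```python
import posixpath
import re


def userdir_path(url_path: str, userdir: str = "public_html", home: str = "/home"):
    """Map /~user/rest to /home/user/public_html/rest; None if not a ~user URL."""
    match = re.match(r"^/~([^/]+)(/.*)?$", url_path)
    if match is None:
        return None
    user = match.group(1)
    rest = match.group(2) or "/"
    # Collapse any ".." parts so the result stays under the user's public_html.
    rest = posixpath.normpath(rest)
    return posixpath.join(home, user, userdir, rest.lstrip("/"))
```

So userdir_path("/~gordon/file.html") gives /home/gordon/public_html/file.html, while an ordinary path such as /index.html is left to the normal document root.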
![Page 18: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/18.jpg)
Linuxzoo tutorials
• Each time you book a linuxzoo machine, you will likely get a different IP and hostname.
• Each time you come in, check your hostname with “hostname”.

$ hostname
host-5-5.linuxzoo.net

• In this example, virtual hosts vm-5-5.linuxzoo.net, as well as host-5-5 and web-5-5, will be proxied to your machine.
• Warning: If the server on which your virtual machine runs fails, you will be moved to a different machine and a different IP. You need to check your hostname when you boot!
![Page 19: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/19.jpg)
Web access from the prompt
• The prompt is fast and convenient for admin purposes, but when you are debugging http sometimes “telnet” is not sufficient.
• There are a few other tools you can use at the prompt.
  – elinks
  – lwp-request
  – wget
• However, there is no simple replacement for actually using a real browser to check your pages.
![Page 20: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/20.jpg)
$ elinks http://linuxzoo.net
![Page 21: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/21.jpg)
Copy http to your directory
• lwp-request http://linuxzoo.net > file.html
  – The data is obtained and then printed to the screen.
  – In this case that is redirected to file.html
• wget http://linuxzoo.net

$ wget http://linuxzoo.net
--19:20:11-- http://linuxzoo.net/
Resolving linuxzoo.net... 146.176.166.1
Connecting to linuxzoo.net|146.176.166.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4785 (4.7K) [text/html]
Saving to: `index.html'
100%[=======================================>] 4,785 --.-K/s in 0s
19:20:11 (304 MB/s) - `index.html' saved [4785/4785]
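The same fetch-and-save step can be done from a script. Below is a minimal Python sketch using only the standard library; the function name is made up, and there is no retry or progress display as wget has.

```python
import os
import shutil
import urllib.request


def save_url(url: str, filename: str) -> int:
    """Download url into filename and return the number of bytes saved."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)  # stream the body straight to disk
    return os.path.getsize(filename)
```

Calling save_url("http://linuxzoo.net", "index.html") would play the role of the wget command above.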
![Page 22: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/22.jpg)
SELinux and Apache
• SELinux secures Apache, and by default it is quite strict about files in public_html.
• Check if SELinux allows files to be published from public_html by
  – getsebool httpd_read_user_content
  – If this is off (0) then publishing files is forbidden.
• Set SELinux to allow public_html publishing using:
  – setsebool -P httpd_read_user_content 1
  – This may take 20 or more seconds. Be patient.
  – The setting will be forgotten if you get a new image in the linuxzoo interface.
• SELinux requires the file security (shown by ls -Z) to be:
  – unconfined_u:object_r:httpd_user_content_t:s0
  – However this should happen automatically provided you create files in public_html
  – You can set the type of, say, filename.html (but remember you should not have to) using:
    • chcon -t httpd_user_content_t filename.html
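The security label shown by ls -Z has the form user:role:type:level, and it is the type field that matters here. A small Python sketch of checking it follows; the function names are made up, and it only inspects a label string you pass in, it does not query SELinux itself.

```python
def selinux_type(context: str) -> str:
    """Return the type field of a user:role:type:level security context."""
    # Split at most three times, because the level part may itself contain colons.
    parts = context.split(":", 3)
    if len(parts) < 4:
        raise ValueError(f"not a full SELinux context: {context!r}")
    return parts[2]


def is_publishable(context: str, expected: str = "httpd_user_content_t") -> bool:
    """True if the file's type is the one expected for pages in public_html."""
    return selinux_type(context) == expected
```

A file still labelled with an ordinary home-directory type would fail this check, which is the situation chcon is there to fix.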
![Page 23: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/23.jpg)
Log Analysis
![Page 24: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/24.jpg)
Logs
• Apache produces two types of log files
  – Error Logs
  – Access Logs
• Error logs are useful for debugging
• Access logs are excellent for monitoring how your site is being used.
  – Fun for people who have hobby sites
  – Life or death if your business relies on the web site.
![Page 25: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/25.jpg)
Where are the logs
• Normally they go to /var/log/httpd/access_log and error_log
• In a virtual host we set them to what we liked:

<VirtualHost>
    …
    ErrorLog logs/gr-error_log
    CustomLog logs/gr-access_log combined
</VirtualHost>
![Page 26: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/26.jpg)
Logging in the /var/log/httpd access file

• The normally used log format is called “combined”.
• It contains significant amounts of information about each page request.
• Specifically, the log format is:

%h %l %u %t %r %>s %b Referrer UserAgent
![Page 27: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/27.jpg)
%h %l %u %t %r %>s %b Referrer UserAgent
• h – IP of the client
• l – useless ident info
• u – username in basic authentication
• t – time of request
• r – the request itself
• s – the response code (e.g. 200 is a successful request)
• b – size of the response page
• Referrer – the page the client says sent it here
• User Agent – identification info of the browser
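These fields can be pulled out of a log line with a regular expression. The Python sketch below assumes the usual on-disk layout of the combined format, with the time in square brackets and the request, referrer and user agent in double quotes. SAMPLE is a made-up line for illustration, not real log data.

```python
import re

# One named group per field of the combined format.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical example line (documentation IP address, invented browser name).
SAMPLE = ('192.0.2.10 - - [01/Oct/2011:09:15:02 +0100] '
          '"GET /index.html HTTP/1.1" 200 4785 "-" "ExampleBrowser/1.0"')

entry = parse_line(SAMPLE)
```

Once each line is a dictionary, questions such as "which requests failed?" become simple filters on the status field.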
![Page 28: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/28.jpg)
Analysing the log
• The log is useful in itself for checking the proper function of the server.
• However, traffic analysis is also valuable.
• There are a number of tools available to do this.
• One of the best free ones is Webalizer.
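As a taste of what such tools compute, the Python sketch below counts requests per hour from the time field of each log entry. It is a toy version of one report, not how any particular tool works. The timestamps in the usage note are invented, and parsing the month name assumes an English locale.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour(timestamps):
    """Count log entries by hour of day, given times like 01/Oct/2011:09:15:02 +0100."""
    counts = Counter()
    for stamp in timestamps:
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[when.hour] += 1
    return counts
```

Given the three made-up times 01/Oct/2011:09:15:02 +0100, 01/Oct/2011:09:45:10 +0100 and 01/Oct/2011:11:00:00 +0100, the result has two requests in hour 9 and one in hour 11.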
![Page 29: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/29.jpg)
Webalizer Summary
![Page 30: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/30.jpg)
Analysis
• The summer is quiet for linuxzoo.
• Students are enthusiastic in October…
• After that it settles down to “kept busy”.
![Page 31: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/31.jpg)
Per day activity – October
![Page 32: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/32.jpg)
• I wonder which day was the first tutorial?
• Look at the 7 day oscillations. This is common in many web sites.
• Who stole all my web site data on the 25th?
![Page 33: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/33.jpg)
Hour analysis – October
![Page 34: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/34.jpg)
• Peak learning time (so they say) is 11am.
• Students here seem to like 9am-4pm.
• American students produce another bump later at night.
![Page 35: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/35.jpg)
Users
![Page 36: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/36.jpg)
Referrer Info
![Page 37: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/37.jpg)
What search terms?
![Page 38: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/38.jpg)
Where from?
![Page 39: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/39.jpg)
Google Analytics
• Another approach to web logging is to use JavaScript embedded in each web page.
• This does away with the need to access the web log.
  – Good if you don’t have access!
• It does mean that
  – You only get logs where there is JavaScript switched on.
  – Each page is slowed by having extra stuff on it.
  – It’s a little more complex.
![Page 40: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/40.jpg)
db.grussell.org
![Page 41: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/41.jpg)
![Page 42: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/42.jpg)
Logging Summary
• What is best?
• I have used both and have mixed feelings…
• Things to consider
  – Convenience
  – Reliability
  – Availability
  – Performance
  – Cost
  – Privacy
  – Complexity
![Page 43: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/43.jpg)
Discussions
![Page 44: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/44.jpg)
Discussion
• Apache runs as a user, usually “apache” or “httpd”. For apache to serve a file from a user’s public_html directory, what permissions would be required?
![Page 45: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/45.jpg)
Discussion
• Here are some mock exam questions you should now be able to answer:
![Page 46: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/46.jpg)
Question 1
• To test a web server which is hosting the virtual host “grussell.org”, using only telnet, what would you type at the telnet prompt?
![Page 47: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/47.jpg)
Question 2
• What fields would you expect to have to define in a VirtualHost definition in Apache?
![Page 48: CSN11121 System Administration and Forensics Week 5: Essential Apache and Log Analysis](https://reader036.fdocuments.us/reader036/viewer/2022062408/568137e9550346895d9f9d59/html5/thumbnails/48.jpg)
Question 3
• Below is a line from a webserver logfile:
157.55.18.25 - - [31/Aug/2011:12:48:04 +0100] "GET /robots.txt HTTP/1.1" 200 48 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
• What kind of request was this? Was this a successful request (i.e. was a document found)?