Programming for Bioinformatics
HTTP, Perl
Ralf Zimmer, Institut für Informatik, Lehrstuhl für Praktische Informatik und Bioinformatik, WS 2010/2011: Bioinformatics Programming Course

Web Clients with Perl
[Diagram: User, Click, HTML document with links, Web browser, Web client (Perl), Web server, some application]

Web Clients with Perl and also Web Servers ...
[Diagram: User, Click, HTML document with links, Web browser, Web client (Perl), Web server, some application, Web server (Perl)]

Web Clients with Perl and also Web applications ...
[Diagram: User, Click, HTML document with links, Web browser, Web client (Perl), Web server, some application, CGI script (Perl), Application (Perl, ...)]

Web Clients: Introduction
[Diagram: as above, with the HyperText Transfer Protocol between Web browser / Web client (Perl) and Web server]

• HTTP = HyperText Transfer Protocol: the protocol that specifies data transfer for the WWW (i.e. the data transmitted when clicking a hyperlink (URL) or submitting a form)
• URL = Uniform Resource Locator (http://<host>/<path_to_document>)
• HTTP client = program communicating via HTTP (with a Web/HTTP server)
• Why write web client programs?
  – automating tedious clicking
  – downloading many pages, many data items
  – scheduling tasks at certain times
  – ...
• History:
  – Tim Berners-Lee (CERN, CH): WWW
  – 1993 U. Illinois, Urbana-Champaign: Mosaic browser
  – 1994 Netscape Navigator

[Diagram: as above; the Web server listens to a port of the network and receives the Request]

[Diagram: Web browser and Web server, each on top of Berkeley Sockets, connected via TCP/IP; the server listens to a port of the network for the Request]

Web Clients: Sample Session
Web browser → Web server
http://www.lmu.de/
  http         use HTTP, the HyperText Transfer Protocol ...
  www.lmu.de   contact this host ...
  /            ... and this document

[Figure: http://www.lmu.de/]
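The URL breakdown from the sample session (protocol, host, document) can be sketched in a few lines of Perl. This is only an illustration using a regular expression and core Perl; the helper name split_http_url is hypothetical, and the pattern handles just the simple http://<host>[:<port>]/<path_to_document> form used on these slides, not every URL that exists.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a URL of the form http://<host>[:<port>]/<path_to_document>
# into its parts. Returns an empty list if the URL does not match.
# (split_http_url is a hypothetical helper, not part of any library.)
sub split_http_url {
    my ($url) = @_;
    my ($host, $port, $path) = $url =~ m{^http://([^/:]+)(?::(\d+))?(/.*)?$}i
        or return;
    return (
        scheme => 'http',
        host   => $host,
        port   => defined $port ? $port : 80,    # 80 is the default HTTP port
        path   => defined $path ? $path : '/',   # no path means the root document
    );
}

my %parts = split_http_url('http://www.lmu.de/');
print "use protocol:       $parts{scheme}\n";
print "contact this host:  $parts{host} (port $parts{port})\n";
print "ask for document:   $parts{path}\n";
```

For real programs a dedicated URL module is usually the safer choice, since a hand-written pattern like this one ignores query strings, user information and other URL features.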
Web Clients: Sample Session
http://www.lmu.de/ is turned into a GET request:

    GET / HTTP/1.0
    Connection: Keep-Alive                 (don't close the connection after the server answers)
    User-Agent: Mozilla/3.0                (client software and revision)
    Host: www.lmu.de                       (server name used in the request)
    Accept: image/gif, image/jpeg, */*     (the client can accept these documents)

Web Clients: Sample Session
Web browser → Web server, http://www.lmu.de/

Request:
    GET / HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/3.0
    Host: www.lmu.de
    Accept: image/gif, image/jpeg, */*

Response:
    HTTP/1.0 200 Ok
    Date: Tue, 01 Jan 2002, 10:00 GMT
    Server: Apache/1.1.1
    Content-type: text/html
    Content-length: ...
    Last-modified: ...

    <title>Example</title>
    [HTML text ...]
    <img src="images/lmu.gif">

Web Clients: Sample Session
The same request typed by hand over telnet:

    > telnet www.lmu.de 80
    Trying 141.84.120.25...
    Connected to www.lmu.de.
    Escape character is '^]'.
    GET / HTTP/1.0
    host:www.lmu.de

    HTTP/1.1 200 OK
    Server: Microsoft-IIS/4.0
    Content-Location: http://141.84.120.25/index.htm
    Date: Wed, 16 Jan 2002 19:41:36 GMT
    Content-Type: text/html
    Accept-Ranges: bytes
    Last-Modified: Thu, 10 Jan 2002 18:17:39 GMT
    ETag: "632921639ac11:1a6f4"
    Content-Length: 9811

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    ...

Web Clients: Sample Session
The browser then requests the image referenced in the HTML.

Request:
    GET /images/lmu.gif HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/3.0
    Host: www.lmu.de
    Accept: image/gif, image/jpeg, */*

Response:
    HTTP/1.0 200 Ok
    Date: Tue, 01 Jan 2002, 10:00 GMT
    Server: Apache/1.1.1
    Content-type: image/gif
    Content-length: ...
    Last-modified: ...

    [GIF-Data]

Web Clients: Sample Session
Following a link to another document works the same way.

Request:
    GET /example.html HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/3.0
    Host: www.lmu.de
    Accept: image/gif, image/jpeg, */*

Response:
    HTTP/1.0 200 Ok
    Date: Tue, 01 Jan 2002, 10:00 GMT
    Server: Apache/1.1.1
    Content-type: text/html
    Content-length: ...
    Last-modified: ...

    [HTML Data]

HTTP log

Request:
    GET / HTTP/1.1
    Host: darwin.bio.informatik.uni-muenchen.de:6419
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; de-DE; rv:0.9.2) Gecko/20010726 Netscape6/6.1
    Accept: text/xml, application/xml, application/xhtml+xml, text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2, text/plain;q=0.8, text/css, */*;q=0.1
    Accept-Language: de-DE
    Accept-Encoding: gzip,deflate,compress,identity
    Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
    Keep-Alive: 300
    Connection: keep-alive
    If-Modified-Since: Wed, 09 Jan 2002 19:13:01 GMT
    Cache-Control: max-age=0

Response:
    HTTP/1.1 200 OK
    Date: Wed, 09 Jan 2002 19:13:37 GMT
    Server: Apache/1.3.20 (Linux/SuSE)
    Content-Location: index.html.de
    Vary: negotiate,accept-language
    TCN: choice
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Transfer-Encoding: chunked
    Content-Type: text/html
    Content-Language: de

    879
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <HTML>
    <HEAD>
    <TITLE>Webserver Testseite</TITLE>
    ...
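The request and response texts shown in the sample session can be produced and taken apart by a small Perl web client. The following is a minimal sketch under stated assumptions, not a full HTTP implementation: the sub names build_get_request, parse_response and fetch, and the User-Agent string, are hypothetical; only the core module IO::Socket::INET is used. The fetch sub talks to a real server over a TCP socket and is therefore defined but not called here; the demonstration at the end parses a canned response modelled on the slides. Chunked responses, as in the HTTP log, are not decoded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;    # core module, a wrapper around Berkeley sockets

# Build the text of a GET request like the ones in the sample session.
# Every line ends in CRLF, and an empty line ends the header block.
sub build_get_request {
    my ($host, $path) = @_;
    return join "\r\n",
        "GET $path HTTP/1.0",
        "Host: $host",
        "User-Agent: perl-sketch/0.1",    # hypothetical client name
        "Accept: */*",
        "", "";
}

# Split a raw response into status code, reason phrase,
# a hash of headers (lower-cased names) and the body.
sub parse_response {
    my ($raw) = @_;
    die "empty response\n" unless defined $raw && length $raw;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my ($status_line, @header_lines) = split /\r?\n/, $head;
    my ($version, $code, $reason) =
        $status_line =~ m{^(HTTP/\d\.\d)\s+(\d{3})\s*(.*)$}
        or die "not an HTTP response: $status_line\n";
    my %header;
    for my $line (@header_lines) {
        my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $header{ lc $name } = $value;     # header names are case-insensitive
    }
    return ($code, $reason, \%header, defined $body ? $body : '');
}

# Send the request over TCP and return the raw response.
# Needs network access, so it is not called in this sketch.
sub fetch {
    my ($host, $port, $path) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "cannot connect to $host:$port: $@\n";
    print {$sock} build_get_request($host, $path);
    local $/;                             # read until the server closes the connection
    my $raw = <$sock>;
    close $sock;
    return $raw;
}

# Demonstration on a canned response, so the script runs offline.
my $canned = join "\r\n",
    "HTTP/1.0 200 Ok",
    "Server: Apache/1.1.1",
    "Content-type: text/html",
    "",
    "<title>Example</title>";
my ($code, $reason, $header, $body) = parse_response($canned);
print "status: $code $reason\n";
print "type:   $header->{'content-type'}\n";
print "body:   $body\n";
```

With network access, a call such as parse_response(fetch('www.lmu.de', 80, '/')) would repeat the telnet session from the slides. The sketch sends no Connection: Keep-Alive header, so that the server closes the connection after its answer and reading until end-of-file is enough.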
Schedule WS 2010/11
Time marks on the slide: 9 ct, 10:00, 10:45, 11:15, 12:45, 18:00, 24:00

Monday:    Groups & Organisation (Zimmer); Introduction (Zimmer); Assignments I / Surfing; Programming Tools (Unix/cvs/Make/...) (Zimmer); Unix/Shell (Petri); Programming
Tuesday:   Perl Intro (Küffner); Perl Intro (Küffner); Assignments II / Scripting; HTML/HTTP (Küffner); Practical Perl, Perl/CGI (Küffner); Programming
Wednesday: Tools & Databases (Petri); Tools & Databases (Petri); Assignments III / Scripting; Databases/SQL (Windhager); Java/JDBC, Packages (Windhager); Programming
Thursday:  Alignment (Erhard); Alignment (Erhard); Assignments, Programming; Alignment (Erhard); Programming
Friday:    SSP (Secondary structure prediction) (Zimmer); SSP (Secondary structure prediction) (Zimmer); Assignments, Programming; Validation (Petri); Validation (Petri); Programming