HTTP Servlets
A servlet is a "module" that can be integrated into a server
application to respond to client requests. Although a servlet need not use a
specific protocol, we will use the HTTP protocol for communication (see figure
21.1). In practice, the term servlet refers to an HTTP servlet.
The classic method of constructing dynamic HTML pages on a server is to use
CGI (Common Gateway Interface) commands. These take as argument a URL which
can contain data coming from an HTML form. The execution then produces a new
HTML page which is sent to the client. The following links describe the HTTP
and CGI protocols.
Link: http://www.cis.ohio-state.edu/cgi-bin/rfc/rfc1945.html
Link: http://hoohoo.ncsa.uiuc.edu/docs/cgi/overview.html
CGI is a slightly heavyweight mechanism because it launches a new program for
each request.
HTTP servlets are launched just once, and can decode arguments in CGI
format to execute a request. Servlets can take advantage of the Web browser's
capabilities to construct a graphical interface for an application.
Figure 21.1: communication between a browser and an Objective CAML server
In this section we will define a server for the HTTP protocol. We will not
handle the entire specification of the protocol, but instead will limit
ourselves to those functions necessary for the implementation of a server that
mimics the behavior of a CGI application.
Earlier, we defined a generic server module Gsd. We now give the code that
applies this generic server to processing part of the HTTP protocol.
HTTP and CGI Formats
We want to obtain a server that imitates the behavior of a CGI application.
One of the first tasks is to decode the format of HTTP requests with CGI
extensions for argument passing.
The clients of this server can be browsers such as Netscape or Internet
Explorer.
Receiving Requests
Requests in the HTTP protocol have essentially three components: a method,
a URL and some data. The data must follow a particular format.
In this section we will construct a collection of functions for reading,
decomposing and decoding the components of a request. These functions can
raise the exception:
# exception Http_error of string ;;
exception Http_error of string
Decoding
The function decode, which uses the helper function
rep_xcode, attempts to restore the characters which have been
encoded by the HTTP client: spaces (which have been replaced by +),
and certain reserved characters which have been replaced by their hexadecimal
code.
# let rec rep_xcode s i =
    let xs = "0x00" in
    String.blit s (i+1) xs 2 2;
    String.set s i (char_of_int (int_of_string xs));
    String.blit s (i+3) s (i+1) ((String.length s)-(i+3));
    String.set s ((String.length s)-2) '\000';
    Printf.printf "rep_xcode1(%s)\n" s ;;
val rep_xcode : string -> int -> unit = <fun>
# exception End_of_decode of string ;;
exception End_of_decode of string
# let decode s =
    try
      for i=0 to pred(String.length s) do
        match s.[i] with
          '+' -> s.[i] <- ' '
        | '%' -> rep_xcode s i
        | '\000' -> raise (End_of_decode (String.sub s 0 i))
        | _ -> ()
      done;
      s
    with
      End_of_decode s -> s ;;
val decode : string -> string = <fun>
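The function decode above modifies its argument in place, which relies on
character strings being mutable. As a minimal sketch, assuming a more recent
compiler where strings are immutable, the same decoding rules can be applied
while building a new string in a Buffer. The name url_decode is ours and is
not part of the original code; a malformed % sequence is copied unchanged,
which is an arbitrary choice.

```ocaml
(* Sketch: decode '+' and %XX sequences into a fresh string.
   url_decode is a hypothetical name, not part of the original program. *)
let url_decode (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  (* Value of one hexadecimal digit, or None if c is not a hex digit. *)
  let hex_value c =
    match c with
    | '0' .. '9' -> Some (Char.code c - Char.code '0')
    | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
    | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
    | _ -> None
  in
  let rec loop i =
    if i < n then begin
      match s.[i] with
      | '+' ->
          Buffer.add_char buf ' ';
          loop (i + 1)
      | '%' when i + 2 < n ->
          (match hex_value s.[i + 1], hex_value s.[i + 2] with
           | Some h, Some l ->
               Buffer.add_char buf (Char.chr (h * 16 + l));
               loop (i + 3)
           | _ ->
               (* Not a valid escape: keep the '%' as is. *)
               Buffer.add_char buf '%';
               loop (i + 1))
      | c ->
          Buffer.add_char buf c;
          loop (i + 1)
    end
  in
  loop 0;
  Buffer.contents buf
```

Under these assumptions, url_decode "a+b%2Fc" yields the string "a b/c".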
String manipulation functions
The module String_plus contains some functions for taking apart
character strings:
- prefix and suffix, which extract the substrings to
either side of an index;
- split, which returns the list of substrings determined by a
separator character;
- unsplit, which concatenates a list of strings, inserting
separator characters between them.
# module String_plus =
    struct
      let prefix s n =
        try String.sub s 0 n
        with Invalid_argument("String.sub") -> s

      let suffix s i =
        try String.sub s i ((String.length s)-i)
        with Invalid_argument("String.sub") -> ""

      let rec split c s =
        try
          let i = String.index s c in
          let s1, s2 = prefix s i, suffix s (i+1) in
          s1::(split c s2)
        with
          Not_found -> [s]

      let unsplit c ss =
        let f s1 s2 = match s2 with "" -> s1 | _ -> s1^(Char.escaped c)^s2 in
        List.fold_right f ss ""
    end ;;
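For readers using a recent standard library, the following sketch shows
counterparts of split and unsplit built on String.split_on_char and
String.concat. It is an illustration under that assumption, not the module
used in the rest of this section; the two versions of unsplit differ in corner
cases (the one above drops trailing empty strings, String.concat keeps them).

```ocaml
(* Sketch: standard-library counterparts of split and unsplit
   (String.split_on_char requires OCaml 4.04 or later). *)
let split (c : char) (s : string) : string list =
  String.split_on_char c s

let unsplit (c : char) (ss : string list) : string =
  String.concat (String.make 1 c) ss

(* Cutting a form string into its fields, then gluing it back together. *)
let fields = split '&' "start=1999&end=2000"
(* fields = ["start=1999"; "end=2000"] *)

let rebuilt = unsplit '&' fields
(* rebuilt = "start=1999&end=2000" *)
```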
Decomposing data from a form
Requests typically arise from an HTML page containing a form. The contents
of the form are transmitted as a character string containing the names and
values associated with the fields of the form. The function
get_field_pair splits a single name=value field into a pair, and the
function get_form_content transforms the whole string into an association
list.
# let get_field_pair s =
    match String_plus.split '=' s with
      [n;v] -> n,v
    | _ -> raise (Http_error ("Bad field format : "^s)) ;;
val get_field_pair : string -> string * string = <fun>
# let get_form_content s =
    let ss = String_plus.split '&' s in
    List.map get_field_pair ss ;;
val get_form_content : string -> (string * string) list = <fun>
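To make the shape of the result concrete, here is a self-contained sketch of
the same decomposition applied to the kind of string a browser might send for
a form with the fields start, end and action. It uses the standard library
instead of String_plus, and the names field_pair and form_content are ours;
the sample string is an assumption chosen for illustration.

```ocaml
(* Sketch: decomposing "name=value&name=value" into an association list.
   field_pair and form_content are hypothetical names. *)
exception Http_error of string

let field_pair (field : string) : string * string =
  match String.split_on_char '=' field with
  | [name; value] -> (name, value)
  | _ -> raise (Http_error ("Bad field format : " ^ field))

let form_content (s : string) : (string * string) list =
  List.map field_pair (String.split_on_char '&' s)

(* A sample form string and the lookup of one of its fields. *)
let sample = form_content "start=1999&end=2000&action=Send"
let end_date = List.assoc "end" sample
```

Here sample is the list [("start", "1999"); ("end", "2000"); ("action", "Send")]
and end_date is "2000". A field without exactly one = sign raises Http_error,
as in the function get_field_pair.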
Reading and decomposing
The function get_query_string extracts the method and the URL from a
request and stores them in an array of character strings. One can thus use
a standard CGI application which retrieves its arguments from the array of
command-line arguments. The function get_query_string uses the auxiliary
functions get and query_string. We arbitrarily limit requests to a maximum
size of 2555 characters.
# let get =
    let buff_size = 2555 in
    let buff = String.create buff_size in
    (fun ic -> String.sub buff 0 (input ic buff 0 buff_size)) ;;
val get : in_channel -> string = <fun>
# let query_string http_frame =
    try
      let i0 = String.index http_frame ' ' in
      let q0 = String_plus.prefix http_frame i0 in
      match q0 with
        "GET"
          -> begin
               let i1 = succ i0 in
               let i2 = String.index_from http_frame i1 ' ' in
               let q = String.sub http_frame i1 (i2-i1) in
               try
                 let i = String.index q '?' in
                 let q1 = String_plus.prefix q i in
                 let q = String_plus.suffix q (succ i) in
                 Array.of_list (q0::q1::(String_plus.split ' ' (decode q)))
               with
                 Not_found -> [|q0;q|]
             end
      | _ -> raise (Http_error ("Unsupported method: "^q0))
    with e -> raise (Http_error ("Unknown request: "^http_frame)) ;;
val query_string : string -> string array = <fun>
# let get_query_string ic =
    let http_frame = get ic in
    query_string http_frame;;
val get_query_string : in_channel -> string array = <fun>
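The following sketch illustrates the array layout produced for the first line
of a GET request: the method, then the path, then the form data when a ?
is present. It is a simplified stand-in written against the standard library,
not the function query_string itself: it does not decode the arguments, and
the name parse_request_line is an assumption of ours.

```ocaml
(* Sketch: splitting a request line such as
   "GET /fees_state?start=1&end=2 HTTP/1.0" into an array of strings.
   parse_request_line is a hypothetical name; no decoding is performed. *)
let parse_request_line (line : string) : string array =
  match String.split_on_char ' ' line with
  | "GET" :: target :: _ ->
      (match String.index_opt target '?' with
       | Some i ->
           let path = String.sub target 0 i in
           let query =
             String.sub target (i + 1) (String.length target - i - 1) in
           [| "GET"; path; query |]
       | None -> [| "GET"; target |])
  | ["GET"] -> failwith "Missing URL"
  | meth :: _ -> failwith ("Unsupported method: " ^ meth)
  | [] -> failwith "Empty request"
```

With this sketch, the line "GET /baro.html HTTP/1.0" gives
[|"GET"; "/baro.html"|], and the line
"GET /fees_state?start=1&end=2 HTTP/1.0" gives
[|"GET"; "/fees_state"; "start=1&end=2"|].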
The Server
To obtain a CGI pseudo-server, able to process only the GET method, we
write the class http_servlet, whose argument fun_serv is
a function for processing HTTP requests such as might have been written for a
CGI application.
# module Text_Server = Server (struct
                                 type t = string
                                 let to_string x = x
                                 let of_string x = x
                               end);;
# module P_Text_Server (P : PROTOCOL) =
    struct
      module Internal_Server = Server (P)

      class http_servlet n np fun_serv =
        object(self)
          inherit [P.t] Internal_Server.server n np

          method receive_h fd =
            let ic = Unix.in_channel_of_descr fd in
            input_line ic

          method process fd =
            let oc = Unix.out_channel_of_descr fd in
            (
             try
               let request = self#receive_h fd in
               let args = query_string request in
               fun_serv oc args;
             with
               Http_error s -> Printf.fprintf oc "HTTP error : %s <BR>" s
             | _ -> Printf.fprintf oc "Unknown error <BR>"
            );
            flush oc;
            Unix.shutdown fd Unix.SHUTDOWN_ALL
        end
    end;;
As we do not expect the servlet to communicate using Objective CAML's special
internal values, we choose the type string as the protocol type. The
functions of_string and to_string do nothing.
# module Simple_http_server =
    P_Text_Server (struct
                     type t = string
                     let of_string x = x
                     let to_string x = x
                   end);;
Finally, we write the primary function to launch the service and construct an
instance of the class http_servlet.
# let cgi_like_server port_num fun_serv =
    let sv = new Simple_http_server.http_servlet port_num 3 fun_serv in
    sv#start;;
val cgi_like_server : int -> (out_channel -> string array -> unit) -> unit =
  <fun>
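The type of cgi_like_server shows what a request handler must look like: it
receives the output channel of the connection and the array of request
elements, and writes its response on the channel. As a minimal sketch, here is
a handler of that type which simply echoes its arguments as HTML. The name
echo_serv is an assumption of ours and does not appear in the original
program.

```ocaml
(* Sketch: a request handler of type out_channel -> string array -> unit.
   echo_serv is a hypothetical name; it lists the request elements. *)
let echo_serv (oc : out_channel) (args : string array) : unit =
  output_string oc "<HTML><BODY>\n";
  Array.iteri
    (fun i a -> Printf.fprintf oc "arg %d: %s<BR>\n" i a)
    args;
  output_string oc "</BODY></HTML>\n"
```

Assuming the definitions of this section are loaded, such a handler could be
passed as the second argument of cgi_like_server, for instance with port
number 4003.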
Testing the Servlet
It is always useful during development to be able to test the parts that are
already built. For this purpose, we build a small HTTP server which sends the
file specified in the HTTP request as is. The function simple_serv
sends the file whose name follows the GET request (the second element of the
argument array). The function also displays all of the arguments passed in
the request.
# let send_file oc f =
    let ic = open_in_bin f in
    try
      while true do
        output_byte oc (input_byte ic)
      done
    with End_of_file -> close_in ic;;
val send_file : out_channel -> string -> unit = <fun>
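The function send_file copies the file one byte at a time and closes the
input channel only when the end of the file is reached. A possible variant,
sketched below under the assumption of a recent standard library (Bytes and
Fun.protect, the latter from OCaml 4.08), copies the file in blocks and closes
the channel even if writing fails. The name send_file_chunked and the block
size of 4096 bytes are arbitrary choices of ours.

```ocaml
(* Sketch: block-wise file copy to an output channel.
   send_file_chunked is a hypothetical name; 4096 is an arbitrary size. *)
let send_file_chunked (oc : out_channel) (path : string) : unit =
  let ic = open_in_bin path in
  let buf = Bytes.create 4096 in
  let rec copy () =
    (* input returns 0 only at end of file. *)
    let n = input ic buf 0 (Bytes.length buf) in
    if n > 0 then begin
      output oc buf 0 n;
      copy ()
    end
  in
  (* The input channel is closed whether or not copy raises. *)
  Fun.protect ~finally:(fun () -> close_in ic) copy
```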
# let simple_serv oc args =
    try
      Array.iter (fun x -> print_string (x^" ")) args;
      print_newline();
      send_file oc args.(1)
    with _ -> Printf.printf "error\n" ;;
val simple_serv : out_channel -> string array -> unit = <fun>
# let run n = cgi_like_server n simple_serv;;
val run : int -> unit = <fun>
The command run 4003 launches this servlet on port 4003. In
addition, we launch a browser to issue a request to load the page
baro.html on port 4003. Figure 21.2 shows the
display of the contents of this page in the browser.
Figure 21.2: HTTP request to an Objective CAML servlet
The browser has sent the request GET /baro.html to load the page, and
then the request GET /canard.gif to load the image.
HTML Servlet Interface
We will use a CGI-style server to build an HTML-based interface to the
database of chapter 6 (see page ??).
The menu of the function main will now be displayed in a form on an
HTML page, providing the same selections. The responses to requests are also
HTML pages, generated dynamically by the servlet. The dynamic page
construction makes use of the utilities defined below.
Application Protocol
Our application will use several elements from several protocols:
- Requests are transmitted from a Web browser to our application server
in the HTTP request format.
- The data items within a request are encoded in the format used by CGI
applications.
- The response to the request is presented as an HTML page.
- Finally, the nature of the request is specified in a format specific
to the application.
We wish to respond to three kinds of request: queries for the list of mail
addresses, queries for the list of email addresses, and queries for the state
of received fees between two given dates. We give these query types
respectively the names mail_addr, email_addr and fees_state. In the last
case, we will also transmit two character strings containing the desired
dates. These two dates correspond to the values of the fields start and end
on an HTML form.
When a client first connects, the following page is sent. The names of the
requests are encoded within it in the form of HTML anchors.
<HTML>
<TITLE> association </TITLE>
<BODY>
<HR>
<H1 ALIGN=CENTER>Association</H1>
<P>
<HR>
<UL>
<LI>List of
<A HREF="http://freres-gras.ufr-info-p6.jussieu.fr:12345/mail_addr">
mail addresses
</A>
<LI>List of
<A HREF="http://freres-gras.ufr-info-p6.jussieu.fr:12345/email_addr">
email addresses
</A>
<LI>State of received fees<BR>
<FORM
method="GET"
action="http://freres-gras.ufr-info-p6.jussieu.fr:12345/fees_state">
Start date : <INPUT type="text" name="start" value="">
End date : <INPUT type="text" name="end" value="">
<INPUT name="action" type="submit" value="Send">
</FORM>
</UL>
<HR>
</BODY>
</HTML>
We assume that this page is contained in the file assoc.html.
HTML Primitives
The HTML utility functions are grouped together into a single class called
print. It has a field specifying the output channel. Thus, it can
be used just as well in a CGI application (where the output channel is the
standard output) as in an application using the HTTP server defined in the
previous section (where the output channel is a network socket).
The proposed methods essentially allow us to encapsulate text within HTML
tags. This text is either passed directly as an argument to the method in
the form of a character string, or produced by a function. For example,
the principal method page takes as its first argument a string
corresponding to the header of the page, and as its second argument
a function that prints out the contents of the page. The method page
produces the tags corresponding to the HTML protocol.
The names of the methods match the names of the corresponding HTML tags, with
additional options added in some cases.
# class print (oc0:out_channel) =
    object(self)
      val oc = oc0
      method flush () = flush oc
      method str =
        Printf.fprintf oc "%s"
      method page header (body:unit -> unit) =
        Printf.fprintf oc "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n<BODY>" header;
        body();
        Printf.fprintf oc "</BODY>\n</HTML>\n"
      method p () =
        Printf.fprintf oc "\n<P>\n"
      method br () =
        Printf.fprintf oc "<BR>\n"
      method hr () =
        Printf.fprintf oc "\n<HR>\n"
      method h i s =
        Printf.fprintf oc "<H%d>%s</H%d>" i s i
      method h_center i s =
        Printf.fprintf oc "<H%d ALIGN=\"CENTER\">%s</H%d>" i s i
      method form url (form_content:unit -> unit) =
        Printf.fprintf oc "<FORM method=\"post\" action=\"%s\">\n" url;
        form_content ();
        Printf.fprintf oc "</FORM>"
      method input_text =
        Printf.fprintf oc
          "<INPUT type=\"text\" name=\"%s\" size=\"%d\" value=\"%s\">\n"
      method input_hidden_text =
        Printf.fprintf oc
          "<INPUT type=\"hidden\" name=\"%s\" value=\"%s\">\n"
      method input_submit =
        Printf.fprintf oc
          "<INPUT name=\"%s\" type=\"submit\" value=\"%s\">"
      method input_radio =
        Printf.fprintf oc
          "<INPUT type=\"radio\" name=\"%s\" value=\"%s\">\n"
      method input_radio_checked =
        Printf.fprintf oc
          "<INPUT type=\"radio\" name=\"%s\" value=\"%s\" CHECKED>\n"
      method option =
        Printf.fprintf oc "<OPTION> %s\n"
      method option_selected opt =
        Printf.fprintf oc "<OPTION SELECTED> %s" opt
      method select name options selected =
        Printf.fprintf oc "<SELECT name=\"%s\">\n" name;
        List.iter
          (fun s -> if s=selected then self#option_selected s else self#option s)
          options;
        Printf.fprintf oc "</SELECT>\n"
      method options selected =
        List.iter
          (fun s -> if s=selected then self#option_selected s else self#option s)
    end ;;
We will assume that these utilities are provided by the module
Html_frame.
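The way the method page takes the page body as a function can be illustrated
on a reduced example. The class mini_print below is a hypothetical stand-in of
ours that keeps only three of the methods, so that the sketch is
self-contained; the real class has the same structure with more methods.

```ocaml
(* Sketch: a reduced stand-in for the class print (hypothetical name
   mini_print), keeping only str, br and page. *)
class mini_print (oc : out_channel) =
  object
    method str s = output_string oc s
    method br () = output_string oc "<BR>\n"
    method page title (body : unit -> unit) =
      Printf.fprintf oc "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n<BODY>" title;
      body ();
      output_string oc "</BODY>\n</HTML>\n"
  end

(* The body is a closure: it is only run once the opening tags are out. *)
let hello_page (p : mini_print) : unit =
  p#page "Hello" (fun () -> p#str "Hello, world"; p#br ())
```

Because the body is passed as a function, nothing is accumulated in memory:
the opening tags, the body and the closing tags are written to the channel in
that order.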
Dynamic Pages for Managing the Association Database
For each of the three kinds of request, the application must construct a page
in response. For this purpose we use the utility module Html_frame
given above. This means that the pages are not really constructed, but that
their various components are emitted sequentially on the output channel.
We provide an additional (virtual) page to be returned in response to a
request that is invalid or not understood.
Error page
The function print_error takes as arguments an object for emitting
an HTML page (an instance of the class print) and a
character string containing the error message.
# let print_error (print:Html_frame.print) s =
    let print_body() =
      print#str s; print#br()
    in
    print#page "Error" print_body ;;
val print_error : Html_frame.print -> string -> unit = <fun>
All of our functions for emitting responses to requests will take as their
first argument an object for emitting an HTML page.
List of mail addresses
To obtain the page giving the response to a query for the list of mail
addresses, we will format the list of character strings obtained by the
function mail_addresses, which was defined as part of the database
(see page ??). We will assume that this function,
and all others directly involving requests to the database, have been defined
in a module named Assoc.
To emit this list, we use a function for outputting simple lines:
# let print_lines (print:Html_frame.print) ls =
    let print_line l = print#str l; print#br() in
    List.iter print_line ls ;;
val print_lines : Html_frame.print -> string list -> unit = <fun>
The function for responding to a query for the list of mail addresses is:
# let print_mail_addresses print db =
    print#page "Mail addresses"
      (fun () -> print_lines print (Assoc.mail_addresses db)) ;;
val print_mail_addresses : Html_frame.print -> Assoc.data_base -> unit =
  <fun>
In addition to the parameter for emitting a page, the function
print_mail_addresses takes the database as its second parameter.
List of email addresses
This function is built on the same principles as that giving the list of mail
addresses, except that it calls the function email_addresses from
the module Assoc:
# let print_email_addresses print db =
    print#page "Email addresses"
      (fun () -> print_lines print (Assoc.email_addresses db)) ;;
val print_email_addresses : Html_frame.print -> Assoc.data_base -> unit =
  <fun>
State of received fees
The same principle also governs the definition of this function: retrieving
the data corresponding to the request (which here is a pair), then emitting
the corresponding character strings.
# let print_fees_state print db d1 d2 =
    let ls, t = Assoc.fees_state db d1 d2 in
    let page_body() =
      print_lines print ls;
      print#str ("Total : "^(string_of_float t));
      print#br()
    in
    print#page "State of received fees" page_body ;;
val print_fees_state :
  Html_frame.print -> Assoc.data_base -> string -> string -> unit = <fun>
Analysis of Requests and Response
We define two functions for producing responses based on an HTTP request. The
first (print_get_answer) responds to a request presumed to be
formulated using the GET method of the HTTP protocol. The second
alters the production of the answer according to the actual method that the
request used.
These two functions take as their second argument an array of character
strings containing the elements of the HTTP request as analyzed by the
function get_query_string (see page ??).
The first element of the array contains the method, the second the name of
the database request.
In the case of a query for the state of received fees, the start and end dates
for the request are contained in the two fields of the form associated with
the query. The data from the form are contained in the third field of the
array, which must be decomposed by the function get_form_content
(see page ??).
# let print_get_answer print q db =
    match q.(1) with
    | "/mail_addr" -> print_mail_addresses print db
    | "/email_addr" -> print_email_addresses print db
    | "/fees_state"
        -> let nvs = get_form_content q.(2) in
           let d1 = List.assoc "start" nvs
           and d2 = List.assoc "end" nvs in
           print_fees_state print db d1 d2
    | _ -> print_error print ("Unknown request: "^q.(1)) ;;
val print_get_answer :
  Html_frame.print -> string array -> Assoc.data_base -> unit = <fun>
# let print_answer print q db =
    try
      match q.(0) with
        "GET" -> print_get_answer print q db
      | _ -> print_error print ("Unsupported method: "^q.(0))
    with
      e
        -> let s = Array.fold_right (^) q "" in
           print_error print ("Something wrong with request: "^s) ;;
val print_answer :
  Html_frame.print -> string array -> Assoc.data_base -> unit = <fun>
Main Entry Point and Application
The application is a standalone executable that takes the port number as a
parameter. It reads in the database before launching the server. The main
function is obtained from the function print_answer defined above
and from the generic HTTP server function cgi_like_server defined
in the previous section (see page ??). The latter
function is located in the module Servlet.
# let get_port_num() =
    if (Array.length Sys.argv) < 2 then 12345
    else
      try int_of_string Sys.argv.(1)
      with _ -> 12345 ;;
val get_port_num : unit -> int = <fun>
# let main() =
    let db = Assoc.read_base "assoc.dat" in
    let assoc_answer oc q = print_answer (new Html_frame.print oc) q db in
    Servlet.cgi_like_server (get_port_num()) assoc_answer ;;
val main : unit -> unit = <fun>
To obtain a complete application, we combine the definitions of the display
functions into a file httpassoc.ml. The file ends with a call to the
function main:
main() ;;
We can then produce an executable named assocd using the compilation
command:
ocamlc -thread -custom -o assocd unix.cma threads.cma \
gsd.cmo servlet.cmo html_frame.cmo string_plus.cmo assoc.cmo \
httpassoc.ml -cclib -lunix -cclib -lthreads
All that's left is to launch the server, load the HTML page contained in the
file assoc.html given at the beginning of this section (page ??), and click.
Figure 21.3 shows an example of the application in use.
Figure 21.3: HTTP request to an Objective CAML servlet
The browser establishes an initial connection with the servlet, which sends it
the menu page. Once the entry fields are filled in, the user sends a new
request which contains the data entered. The server decodes the request and
calls on the association database to retrieve the desired information. The
result is translated into HTML and sent to the client, which then displays
this new page.
To Learn More
This application has numerous possible enhancements. First of all, the HTTP
protocol used here is overly simple compared to the new versions, which add a
header supplying the type and length of the page being sent. Likewise, the
method POST, which allows modification of the server, is not
supported.
To be able to describe the type of a page to be returned, the servlet would
have to support the MIME convention, which is used for describing documents
such as those attached to email messages.
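As a minimal sketch of what such a header could look like, the function below
writes an HTTP/1.0 status line followed by the Content-Type and
Content-Length fields, a blank line, and then the page itself. The name
send_response is an assumption of ours; the servlet of this section sends the
page with no header at all. Note that the whole body must be available as a
string so that its length can be announced, unlike the sequential emission
used by the class print.

```ocaml
(* Sketch: sending a page preceded by a minimal HTTP/1.0 header.
   send_response is a hypothetical name, not part of the servlet above. *)
let send_response (oc : out_channel) ~(content_type : string)
    (body : string) : unit =
  Printf.fprintf oc "HTTP/1.0 200 OK\r\n";
  Printf.fprintf oc "Content-Type: %s\r\n" content_type;
  Printf.fprintf oc "Content-Length: %d\r\n" (String.length body);
  (* An empty line separates the header from the body. *)
  output_string oc "\r\n";
  output_string oc body;
  flush oc
```

For an HTML page the call would be of the form
send_response oc ~content_type:"text/html" page, and for the image of
figure 21.2 the type would be "image/gif".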
The transmission of images, such as in figure 21.2, makes it
possible to construct interfaces for 2-player games (see chapter 17), where
one associates links with drawings of
positions to be played. Since the server knows which moves are legal, only
the valid positions are associated with links.
The MIME extension also allows defining new types of data. One can thus
support a private protocol for Objective CAML values by defining a new MIME type.
These values will be understandable only by an Objective CAML program using the
same private protocol. In this way, a request by a client for a remote
Objective CAML value can be issued via HTTP. One can even pass a serialized closure
as an argument within an HTTP request. This, once reconstructed on the server
side, can be executed to provide the desired result.