© Anthony Grace


 

CHAPTER 4:

CGI the Java way

 

4.1 Overview

Java servlets are a means of extending the functionality of servers; they are small pieces of Java code that the Web server loads to handle client requests. Unlike CGI applications, servlets remain in memory after the client requests terminate. The most common use for a servlet is to extend a Web server by providing dynamic Web content. These server-oriented classes offer system independence and better performance than the traditional CGI approach. Web servers were originally built to serve static Web pages and not much more. They lacked the means to access a database and to return the results of a query as HTML. Since then, the Web has undergone explosive growth from a simple publishing medium to a sophisticated application environment. Today’s large retail sites, for example, have to deal with constantly changing stock levels and prices, and often consist entirely of dynamically generated pages, with no HTML files whatsoever. The new Java Servlet API, in conjunction with the Java Web Server, is a powerful tool for the construction of Web pages on-the-fly.

The introduction of CGI gave Webmasters the means to generate such HTML pages dynamically. Web content could be customised for each user, who could then send information back to the HTTP server. However, CGI was added in an ad-hoc manner to the existing set of Web protocols and, as such, is somewhat unwieldy. Although CGI represented a major breakthrough in bringing the Web to life, it has many problems associated with it: interactive sites require the use of scripting languages like Perl and people who know how to use them; most of all, there is a marked degradation in server performance. With CGI, when a request is made to the Web server, that server has to spawn a new process, pass any parameters to the CGI script running in that process, read the resulting HTML page from the process and then return it to the Web browser. This whole procedure is very inefficient. Although proprietary APIs and libraries have been introduced by Microsoft, Netscape and others in an effort to get around this, any resulting code is tied to a single platform. The new Java Servlet API addresses the problems of speed and reliability. Servlets inherit the obvious advantages of the Java language such as cross-platform portability, garbage collection, multi-threading, synchronisation, performance and reliability.

Another drawback with CGI was its inherent complexity; user parameters were either appended as name-value pairs to the URL used to call the CGI script ( GET requests ), or were included in the body of HTTP POST requests. In either case, the CGI script would have to parse these parameters, see to any processing, and finally write the resulting information, as well as the type of response, back to the HTTP server. Servlets get around this by providing parameters to the servlet in pre-parsed hash tables, giving easy access to user information. In addition, servlets can do things that applets cannot; they can write to files and open sockets, and they can do this very quickly since they are invoked as threads in a daemon process. Using servlets, we do not have to worry about the inner workings of the server; form data, server headers, cookies, etc. are all handled for us by the Servlet API’s underlying classes.

4.2 Servlets

Servlets are Java classes loaded into and invoked by a Web server. They can be considered the server equivalent of applets on the browser side. Once loaded, they become part of the Web server, leading to improved performance since they do not have to be spawned at every request. As a result of becoming part of the Web server, servlets have the ability to save state between invocations. This is an important advantage in the stateless environment of HTTP.

A servlet works in a similar manner to a CGI page generator: the servlet class is loaded into the Web server and, in reply to an HTTP request to its host Web server, uses local information, parameters passed from the browser, etc., to construct an HTML page or some other type of response before returning that page to the server. Because the servlet runs in the same process and address space as its host Web server, there is a minimum of inter-process overhead.

Figure 18: Web server with embedded servlet

The ‘Write Once, Run Anywhere’ advantage that Java provides means that JavaServer applications can be made to run on any platform. For example, we can develop our applications on a PC and run them on a UNIX server. JavaSoft released JavaServer to support the Servlet APIs. It is written in Java, and can run on any machine which supports the Java Virtual Machine.

As well as being highly portable, servlets also give us a degree of independence from proprietary servers and protocols. Besides acting as possible replacements for CGI, they have other uses in an enterprise environment; we can now write our own customised servlets and ‘plug’ them into different vendors’ products such as Web servers ( HTTP ), file servers ( NFS ) and mail servers ( SMTP ). This process is generally referred to as expanding the capabilities of ‘services’.

Since servlets stay resident in memory, they are very fast. Static or persistent information can be shared across several invocations, allowing many users to share information. In common with the Object Oriented paradigm, servlets are highly modular; each servlet performs a specific task and we can then ‘chain’ them together. That is to say, by chaining servlets together we can get them to talk to each other, and co-operate to accomplish the goals we desire.

4.3 The Servlet Architecture

In order to develop servlets, we need the Java Servlet Development Kit ( JSDK ), which is freely downloadable, and a servlet-enabled Web server on which to run them. The basic interfaces and classes needed to develop servlets are contained in the javax.servlet and javax.servlet.http packages.

The central abstraction in the JSDK is the Servlet interface. All servlets implement this interface either directly or by extending a class that implements it such as HttpServlet. The Servlet interface declares, but does not implement, methods that manage the servlet and its interaction with clients. The following UML diagram illustrates this relationship:

 

Figure 19: The Servlet Interface

4.3.1 The javax.servlet package

The following package diagram gives a more detailed view of the interaction between the interfaces and classes that make up the javax.servlet package:

 

Figure 20: javax.servlet Package Diagram

Note: Code stubs for all interfaces and classes can be found in the Appendices

 

Summary of the Architectural Elements

All servlets must implement the standard methods defined in the Servlet interface. When a servlet is first loaded, the server initialises it by calling its init() method, passing it a ServletConfig object containing the servlet’s configuration and initialisation parameters. When implementing the ServletConfig interface, we must write methods that the servlet can use to obtain its initialisation parameters as well as the context in which it is running. The getServletContext() method returns the servlet’s context in a ServletContext object.

The server invokes the service() method, which accepts two parameters: a ServletRequest object for its input and a ServletResponse object through which the servlet returns its output. Servlets run until the server destroys them; a server kills a servlet by calling its destroy() method. In some cases, when the server calls the destroy() method, there may be other threads still running service requests. This type of multithreading scenario will be discussed later.

The ServletContext interface lets a servlet find out about its environment as well as log significant events. It is up to the programmer to decide which data to log. This interface is implemented by servers and used by servlets. Other methods return, for example, the MIME type of a specified file or the name and version of the server under which the servlet is running. Familiarity with these and other methods and attributes can be gained through programming practice.

The GenericServlet helper class implements the Servlet and ServletConfig interfaces. It provides us with a simple implementation of init() and destroy() as well as the methods in the ServletConfig interface. GenericServlet also provides a log() method, which passes messages on to the log() method defined by the ServletContext interface. It is up to us to override and implement the abstract service() method.

The ServletRequest interface gives the servlet access to the names of the parameters passed in by the client, the protocol used by the client, and the names of the remote host making the request as well as the server that received it. Objects that implement this interface are passed as an argument of the service() method. The getInputStream() method returns an object of type ServletInputStream which we can use to read data sent by clients, for example with the HTTP POST and PUT methods. getContentLength() returns the size of the request data or, if it is not known, -1. Most of the method names are fairly self-explanatory. The getScheme() method returns the scheme of the URL used for the request: for example, http, ftp, etc. The getServerPort() method returns the port number on which the request was received. Refer to the javax.servlet.http package section later.

The ServletResponse interface offers the servlet methods to return its results to the Web server. An object of this type is passed as an argument to the service() method. The getOutputStream() method returns an output stream which the servlet can use to write response data back to the client. The servlet does whatever processing is necessary to create an HTML page in response to the request and then writes the HTML code to the ServletOutputStream.

The ServletInputStream helper class provides an input stream for servlets to read incoming requests and data. This class provides us with a readLine() method which reads data into an array of bytes, starting at a specified offset. The method reads either the number of bytes that we specify or the bytes up to a newline ( \n ), whichever comes first.

The ServletOutputStream helper class provides an output stream for servlets to write their responses, if any. We can write basic Java types such as booleans, characters, floats and ints to the stream using the overloaded print() method. The println() method writes the same basic types except that, in this case, they are followed by a CRLF.

 

4.3.2 The javax.servlet.http Package

The interfaces and classes of the javax.servlet.http package define servlet extensions for the HTTP protocol. The core of this package is the HttpServlet abstract class. Note: an abstract class contains unimplemented methods and cannot be instantiated by itself. Our servlet must extend it and override at least one of its methods.

The following package diagram gives a more detailed view of the interaction between the interfaces and classes that make up the javax.servlet.http package:

 

Figure 21: javax.servlet.http Package Diagram

 

 

Summary of the Architectural Elements

The HttpServlet abstract class extends GenericServlet and provides support for HTTP requests. We can override doPost() to handle HTTP POSTs or doGet() to handle GET requests, which carry data of a limited length in the URL. If we have to manage resources held over the lifetime of a servlet, we must override the inherited init() and destroy() methods. The service() method is declared public; it delegates calls to the other methods defined in this class, such as doGet() and doPost(). We use getLastModified() to assist in the caching of GET requests; it returns the time the requested entity was last modified.

The HttpServletRequest interface handles the communication from the client to the server. It offers us methods with which to extract HTTP header information. The getHeaderNames() method returns an enumeration of strings representing the header names for this request. The getMethod() method returns the HTTP method of the request, such as GET or POST. The getPathInfo() method returns the optional extra path that follows the servlet’s path in the request. If we want the extra path information translated to a real path, we use the getPathTranslated() method.

The HttpServletResponse interface handles the communication from the servlet back to the client. This interface defines helper methods to dynamically create an HTTP response, as well as HTTP constants for the return codes. We use sendRedirect() to return a redirect response to the client, specifying the new URL. The containsHeader() method returns true if the response message already has a header with the field name that we specify.

The HttpUtils class provides us with a useful set of HTTP helper methods. The getRequestURL() method takes an HttpServletRequest object as a parameter and returns the URL that the client requested. An important method in this class is parsePostData(), which takes the length of the posted data and a ServletInputStream as its input. It then parses this input and returns the POST’s form data in a Hashtable of name / value pairs. The parseQueryString() method works in a similar fashion for a query string, taking a String as its input parameter.
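To make the idea of ‘pre-parsed’ parameters concrete, the following is a minimal sketch of the kind of work that parseQueryString() saves us. It is not the JSDK implementation: the class name QueryStringSketch and its parse() method are hypothetical, it uses only standard JDK classes from a current JDK ( Java 8 or later ), and it returns a Map of lists rather than the Hashtable that HttpUtils returns.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the Servlet API. */
final class QueryStringSketch {

    /** Splits "a=1&b=2&a=3" into name -> list of decoded values. */
    static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = decode(eq < 0 ? "" : pair.substring(eq + 1));
            // A name may occur several times, so each name maps to a list
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    private static String decode(String s) {
        try {
            // Decodes %XX escapes and turns '+' into a space
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("name=Ann+Lee&item=pen&item=ink"));
    }
}
```

A CGI script has to carry out these steps itself on every request; a servlet simply asks the request object for a parameter by name.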

 

4.4 The Servlet Life Cycle

Our servlet must be installed in a servlet-enabled Web server such as Javasoft’s Java Web Server, which supports the servlet APIs. We can install servlets at startup as permanent features of the server. Optionally, we can install them dynamically, without having to restart the server. Installing servlets dynamically allows client applets to load servlets into the Java Web Server over a network.

The first thing to do is to make our servlet’s class files available to the Web server. The Java Web server loads servlet class files from one of three locations:

CLASSPATH - servlets loaded in this manner are equivalent to classes loaded into a Java application. They have the same permissions associated with being loaded from the local file system. They are loaded once, when they are first called.

Servlets directory - servlets are loaded from this special directory with a ClassLoader. These servlets can be dynamically reloaded while the server is still running.

URL - servlets can be loaded from a URL and are similar to those loaded from the servlets directory, except that they do not need to be located on the local machine. We can call servlets directly by entering their URL into a browser location window. Alternatively, servlet URLs can be used in HTML tags: as the destination of an anchor, as the action in a form, or as the location to be used when a META tag directs that a page be refreshed.

Note: In most situations where a change to a servlet-enabled Web server is being considered, there is probably an existing server already in place. Such an existing set-up may have backend CGI scripts and add-on server software developed over several years, and for this reason there is usually a natural reluctance to change. However, this need not be of concern as JavaSoft provide server extensions for most of the popular Web servers including Apache. The JSDK ( Java Servlet Development Kit ) contains the actual extensions which, in effect, allow us to reap the benefits of servlets without totally scrapping our existing Web architecture.

A servlet engine can execute all of its servlets in a single JVM, thus permitting them to share data with each other efficiently. At the same time, they are prohibited by the inbuilt security of the Java language from accessing each other’s private data. The servlet life-cycle itself is highly flexible in its range of options for servlet support. The main thing to remember is that a servlet engine must conform to the following life cycle contract:

Create and initialise the servlet.

Handle zero or more service calls from clients.

Destroy the servlet and garbage collect it. Some servers only perform this step on shutdown.

Figure 22: Servlet Life Cycle


The initialisation process was described in paragraph 4.3.1 above. Basically, a servlet’s init(ServletConfig) method is called by the server as soon as the server constructs the servlet’s instance. Depending on the server and its configuration, this can occur:

when the server starts

when the servlet is initially requested, just before the service() method is invoked.

at the discretion of the system administrator.

As stated previously, the ServletConfig object contains the necessary initialisation and configuration parameters. It should be noted that these parameters are supplied to the servlet itself and are not associated with any specific request. In the case of the Java Web Server, the initialisation parameters are usually set manually during the registration process using special applets designed for this purpose. Different servers offer different ways to set initialisation parameters. A specific technique employed only by the Java Web Server involves treating servlets as JavaBeans. These servlets can then be loaded from serialised files. Alternatively, they can have their initialisation parameters set automatically by the server at load time using introspection; this will be looked at in the next section. A servlet may also investigate its environment through the ServletConfig object’s reference to the ServletContext object.

Servlets run until either the server or the system administrator destroys them. The server destroys a servlet by running the destroy() method. This method has only to be run once. A servlet uses the destroy() method to free any resources it has acquired that would not usually be garbage collected. In addition, servlets use the destroy() method to write out any unsaved cached or persistent data that will need to be read during the next call to init().
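The life cycle contract can be illustrated without a servlet engine. The sketch below is an assumption-laden simplification: MiniServlet, CountingServlet and LifeCycleSketch are hypothetical names, requests and responses are plain strings, and everything runs on a single thread. It shows only the ordering guaranteed by the contract: one init(), zero or more service() calls, one destroy().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the Servlet interface, reduced to its life-cycle methods. */
interface MiniServlet {
    void init();
    String service(String request);
    void destroy();
}

/** Records every life-cycle call so that the order can be inspected afterwards. */
final class CountingServlet implements MiniServlet {
    final List<String> events = new ArrayList<>();
    private int hits; // state that survives between service calls

    public void init() {
        events.add("init");
    }

    public String service(String request) {
        hits++;
        events.add("service");
        return "hit " + hits + " for " + request;
    }

    public void destroy() {
        events.add("destroy");
    }
}

final class LifeCycleSketch {

    /** Drives one servlet through the three-step contract. */
    static List<String> run(MiniServlet servlet, List<String> requests) {
        List<String> responses = new ArrayList<>();
        servlet.init();                    // 1. create and initialise
        try {
            for (String r : requests) {    // 2. zero or more service calls
                responses.add(servlet.service(r));
            }
        } finally {
            servlet.destroy();             // 3. destroy, exactly once
        }
        return responses;
    }

    public static void main(String[] args) {
        CountingServlet servlet = new CountingServlet();
        System.out.println(run(servlet, Arrays.asList("/a", "/b")));
        System.out.println(servlet.events);
    }
}
```

Because the same instance serves both requests, the hit counter carries over from one call to the next; this is the state-saving behaviour that a CGI script, started afresh for each request, cannot offer.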

 

4.5 Instance Persistence

Whenever a servlet is loaded, the server creates a single class instance which can handle every request made of that servlet. Such a servlet instance will persist between requests. This enhances performance by keeping memory requirements to a minimum, eliminating object creation overhead, and ultimately enabling persistence. For example, a database connection could be opened once and used repeatedly by a number of servlets, subject to synchronisation measures. Note that any threads created by servlets will also persist between requests.

Figure 23: Multi-thread Servlet Model

All real work is done by threads, and object instances are merely data structures manipulated by the threads. Each client is a thread that invokes the servlet via service(), doGet() and doPost() methods, as shown above. Attention to detail, as well as background processing, is needed when working with threads. Anyone who has programmed in C or C++ will be familiar with interprocess-communication issues such as mutual exclusion, semaphores, etc. Common sense tells us that if two threads are running at the same time, and if these two threads are sharing the same data structure or object instance, inconsistencies can result. If our data is being stored in local variables, then there is no need for concern regarding interaction among threads. On the other hand, when using non-local variables, care and attention is called for, as each client thread may have access to these variables. In such cases, it is possible to write our code ( particularly lines of code involving changes to stored data ) in synchronised blocks, using the synchronized keyword. Consider the following ( taken from ‘Java Servlet Programming’ published by O’Reilly ):

public void doGet(HttpServletRequest req, HttpServletResponse res)
        throws ServletException, IOException
{
    PrintWriter out = res.getWriter();
    int local_count;
    synchronized(this)
    {
        local_count = ++count; // count has been defined non-locally
    }
    out.println("Since loading, this servlet has been accessed " + local_count + " times.");
    // ………
}

Any code executed within a synchronised block or method cannot be executed concurrently by another thread. If another thread wishes to execute such code, it must first obtain the lock, or ‘monitor’, of the object that guards that code; until it does, it must wait. This paradigm will be familiar to anyone used to database transaction processing. In any case, since this feature is largely handled by the Java language itself, it is relatively simple to use. One problem with the technique, however, is that it can cause an appreciable amount of overhead on some platforms. It also has to be noted that local variables are not persistent between requests. Therefore any coding techniques employed are determined by the specifics of the problem at hand. That is: what do we want to save and how often do we want to save it, how many other threads are likely to want to do the same thing at the same time, etc.
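The effect of the synchronised block can be tried out without a Web server. In the sketch below, which is only an illustration under stated assumptions, plain threads stand in for client requests and the hypothetical class SynchronizedCounterSketch stands in for the servlet; the hit() method mirrors the pattern of copying the shared counter into a local variable inside the synchronised block. It assumes a current JDK ( Java 8 or later ).

```java
/** Hypothetical stand-in for a servlet that counts its accesses. */
final class SynchronizedCounterSketch {
    private int count; // shared, non-local state

    /** Called once per simulated request; returns the value seen by this caller. */
    int hit() {
        int localCount;
        synchronized (this) {
            localCount = ++count; // only one thread at a time may update count
        }
        return localCount;       // a local copy is safe to use outside the lock
    }

    synchronized int total() {
        return count;
    }

    /** Starts the given number of client threads and returns the final count. */
    static int simulate(int threads, int hitsPerThread) throws InterruptedException {
        SynchronizedCounterSketch counter = new SynchronizedCounterSketch();
        Thread[] clients = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            clients[i] = new Thread(() -> {
                for (int j = 0; j < hitsPerThread; j++) {
                    counter.hit();
                }
            });
            clients[i].start();
        }
        for (Thread t : clients) {
            t.join(); // wait for every simulated client to finish
        }
        return counter.total();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(simulate(8, 10000));
    }
}
```

With the synchronised block in place, the final count always equals the number of threads multiplied by the hits per thread. If the block is removed, updates can be lost and the total may come out lower.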

There are numerous ways for servlets to save their state: some might use some form of custom file format while others might save their state as serialised Java objects or use a database. Another popular database technique is that of ‘journaling’; here, the servlet’s full state is saved infrequently while incremental changes are recorded in a journal file on a more regular basis.

In the event of the server crashing, there can be extra problems to consider. Chief among these is the fact that the destroy() method will not be called! If the destroy() method does nothing more than free up resources, then this should not be a problem; rebooting will tidy things up. However, if the destroy() method includes code for saving state, then the only solution is for these servlets to save their state more often, perhaps using a journaling system such as that described above. Alternatively, their state could be saved on the completion of a transaction, as with a shopping cart system. If, however, the server crashes while in the destroy() method, we may be left with a partially-written file with garbage written on top of our previous state. Here again, the solution is for the servlet to save its state to some form of temporary file and then to rename that file on top of the official state file in a single operation.
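The ‘write to a temporary file, then replace the official file in one step’ technique might look like the following sketch. It is an illustration only: the class AtomicStateSaveSketch and its methods are hypothetical, the state is reduced to a single string, and it relies on the java.nio.file API of a current JDK ( Java 7 or later ), which is more recent than the JDK discussed in this chapter.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper showing a crash-tolerant way to save servlet state. */
final class AtomicStateSaveSketch {

    static void save(Path stateFile, String state) throws IOException {
        Path dir = stateFile.toAbsolutePath().getParent();
        // The temporary file lives in the same directory as the state file,
        // so that the final rename stays within one file system
        Path tmp = Files.createTempFile(dir, "state", ".tmp");
        try {
            Files.write(tmp, state.getBytes(StandardCharsets.UTF_8));
            // Rename in a single step. Whether an existing target is replaced
            // is platform-specific; POSIX systems replace it.
            Files.move(tmp, stateFile, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op once the move has succeeded
        }
    }

    static String load(Path stateFile) throws IOException {
        return new String(Files.readAllBytes(stateFile), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("servlet-state");
        Path stateFile = dir.resolve("counter.state");
        save(stateFile, "count=1");
        save(stateFile, "count=2");
        System.out.println(load(stateFile));
    }
}
```

If the server crashes while the temporary file is being written, the official state file is untouched; a reader sees either the complete old state or the complete new state.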

The ‘one instance per servlet’ model as depicted above is somewhat of an over-simplification. In reality, each servlet’s registered name, but not its alias, is associated with one instance of the servlet; that is, the actual name itself determines which instance is responsible for handling the request.

 

4.6 The Single Thread Model

Rather than having just a single servlet instance for each registered name, it is also possible for a servlet to have a pool of instances for each of its registered names. Each of these instances can share the handling of client requests. In order to use this alternative form of life cycle, the servlet must implement the empty javax.servlet.SingleThreadModel tag interface:

public interface SingleThreadModel { }

According to the Servlet API documentation, any server that loads a SingleThreadModel servlet must guarantee “that no two threads will execute concurrently the service method of that servlet”. In practice, this means that when a servlet implements this interface, the server will ensure that each instance of the servlet only handles a single service request at a time. The server will maintain the pool of servlet instances and dispatch any incoming requests to free servlets within the pool. This means that any of our servlets implementing this interface are thread safe and will not need synchronised access to their instance variables. This form of life cycle is perfect for transaction-based database access, where several database commands can form a single atomic transaction. By having just a single ‘connection’ instance variable for each servlet, concurrent requests to a servlet can be easily handled by allowing the server to manage the pool of servlet instances.

 

Figure 24: The Single Thread Model
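How a server might honour the SingleThreadModel guarantee can be sketched with a small pool of instances. The code below is not how any particular server does it; InstancePoolSketch and Handler are hypothetical names, the pool is a standard BlockingQueue, and a current JDK ( Java 8 or later ) is assumed. Each simulated request borrows a free instance, uses it exclusively, and hands it back.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pool that lets only one thread at a time use each instance. */
final class InstancePoolSketch {

    /** Stand-in for a servlet instance with unsynchronised instance state. */
    static final class Handler {
        private int handled;                 // no lock needed: one thread at a time
        private final AtomicInteger active = new AtomicInteger();
        private volatile boolean overlapped; // set if two threads ever share this instance

        void service() {
            if (active.incrementAndGet() > 1) {
                overlapped = true;
            }
            handled++;
            active.decrementAndGet();
        }
    }

    private final Handler[] all;
    private final BlockingQueue<Handler> free;

    InstancePoolSketch(int size) {
        all = new Handler[size];
        free = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            all[i] = new Handler();
            free.add(all[i]);
        }
    }

    /** Dispatches one request to a free instance, waiting if none is free. */
    void dispatch() throws InterruptedException {
        Handler h = free.take();
        try {
            h.service();
        } finally {
            free.add(h); // return the instance to the pool
        }
    }

    int totalHandled() {
        int sum = 0;
        for (Handler h : all) {
            sum += h.handled;
        }
        return sum;
    }

    boolean anyOverlap() {
        for (Handler h : all) {
            if (h.overlapped) {
                return true;
            }
        }
        return false;
    }

    /** Runs the given number of client threads against a pool of the given size. */
    static InstancePoolSketch simulate(int poolSize, int clients, int requestsEach)
            throws InterruptedException {
        InstancePoolSketch pool = new InstancePoolSketch(poolSize);
        Thread[] threads = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < requestsEach; j++) {
                        pool.dispatch();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        InstancePoolSketch pool = simulate(3, 8, 1000);
        System.out.println(pool.totalHandled() + " requests, overlap: " + pool.anyOverlap());
    }
}
```

The trade-off is that a request must wait whenever every instance in the pool is busy, so the pool size limits the degree of concurrency.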

The skeleton code for handling database connections with the SingleThreadModel interface is shown below:

 

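import java.io.*;
import java.sql.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class SingleThreadConnection extends HttpServlet implements SingleThreadModel
{
    Connection con = null; // One connection per servlet instance

    public void init(ServletConfig config) throws ServletException
    {
        super.init(config);
        try
        {
            con = establishConnection(); // Establish a single connection
            con.setAutoCommit(false);    // Transactions are committed explicitly
        }
        catch(SQLException e)
        {
            throw new ServletException("Could not open connection: " + e.getMessage());
        }
    }

    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException
    {
        res.setContentType("text/plain");
        PrintWriter out = res.getWriter();
        try
        {
            // Use the connection uniquely assigned to this instance
            Statement stmt = con.createStatement();

            // Update the database

            // Commit the transaction
            con.commit();
            stmt.close();
        }
        catch(SQLException e)
        {
            throw new ServletException("Database error: " + e.getMessage());
        }
    }

    public void destroy()
    {
        try
        {
            if(con != null) con.close();
        }
        catch(SQLException e)
        {
            // Nothing more can be done while shutting down
        }
    }

    private Connection establishConnection() throws SQLException
    {
        // Connection set-up is site-specific; the URL, user name and
        // password below are placeholders only
        return DriverManager.getConnection("jdbc:postgresql:dbname", "user", "passwd");
    }
}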

 

4.7 Page Generation

The most basic form of HTTP servlet is used to generate a full HTML page. The same type of information available to CGI scripts is also available to an HTTP servlet. Like a CGI script, a servlet that generates an HTML page can be used for such tasks as processing HTML forms, reporting from a database, etc. The following HTTP servlet example generates a complete HTML page that simply prints “Hello World” every time it is accessed by a client browser:

 

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class HelloWorld extends HttpServlet
{
    public void doGet(HttpServletRequest req,
                      HttpServletResponse res)
            throws ServletException, IOException
    {
        res.setContentType("text/html"); // the response body is an HTML page
        PrintWriter out = res.getWriter();
        out.println("<HTML>");
        out.println("<HEAD><TITLE>Hello World</TITLE></HEAD>");
        out.println("<BODY>");
        out.println("<BIG>Hello World</BIG>");
        out.println("</BODY></HTML>");
        out.close();
    }
}

This servlet extends the HttpServlet class and overrides the doGet() method which it inherits. Every time the server receives a GET request for this servlet, it calls this doGet() method, passing the HttpServletRequest and HttpServletResponse objects as parameters. The request object enables the servlet to obtain information about the client, the parameters for this request, as well as any HTTP headers passed. The response object forms the servlet’s response to the client, and can be used to return data. The type of data being returned ( text, images, etc. ) should be specified. When responding to the client, the doGet() method uses a Writer from the HttpServletResponse object. At the end of the doGet() method, after the response has been sent, the Writer is closed. POST requests are handled in a similar manner.

4.8 Servlet Chaining

It does not always make sense to focus all of our processing in one place. Many servers that support servlets allow a request to be handled by a sequence of servlets in a process called chaining. The client browser sends its request to the first servlet in the chain. The output from each servlet in the chain is then piped as input to the next servlet, and each of these servlets, in turn, has the option of changing or extending the content. The output from the last servlet in the chain gets sent back to the browser.

 

Figure 25: Servlet Chaining

 

There are two main ways to initiate a chain of servlets for an incoming request. We can explicitly tell the server that certain URLs are to be handled by a specified chain. Alternatively, we can instruct the server to pipe all output of a certain content type through a particular servlet before returning it to the client, perhaps even converting it to a different content type; this technique is known as filtering. A point to note is that the initial content does not necessarily have to come from a servlet. It could come from a CGI script, for example. The Java Web Server does not need to make this distinction as all of its requests are handled by servlets. In fact, JWS uses servlets for everything, including a file servlet for sending HTML and graphics, etc.

As regards servlet headers, a servlet in the middle or at the end of a servlet chain reads header values from the previous servlet’s response and not from the client’s request. A servlet has the ability to process the body content and the header values from the output of the previous servlet. It can add extra headers or change the values of existing headers. Further, it can suppress the previous servlet’s headers. Unless it has a specific reason to do otherwise, a well-behaved servlet always passes on the previous servlet’s headers.

Every request handled by a servlet has an input stream associated with it. Data read from the input stream can be of any content type and length. The input stream has three main purposes:

To pass the response body from a previous servlet to the next servlet in the chain.

To pass an HTTP servlet the content associated with a POST request.

To pass a non-HTTP servlet any raw data sent by a client.

A useful application for a servlet chain might be to generate a standardised header and footer for each page of our Web site. We could thus ensure that every HTML page in our site has an up-to-date header / footer combination. Keeping these consistent by hand is not too much of a problem for a small site, but for a site consisting of hundreds of pages a more systematic and consistent approach is needed.

The following example ( taken from Webreview Nov. 1997 ) presents a chained servlet that adds a header to any HTML content passed through it. The servlet is then chained to the JWS’s file servlet.

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;

public class HeaderServlet extends HttpServlet
{
    public void service(HttpServletRequest req,
                        HttpServletResponse res)
            throws ServletException, IOException
    {
        // Pass the incoming content type on unchanged
        String type = req.getContentType();
        if(type != null) res.setContentType(type);

        ServletOutputStream out = res.getOutputStream();
        ServletInputStream in = req.getInputStream();

        if("text/html".equals(type))
        {
            out.println("<HTML><HEAD>");
            out.println("<TITLE>BugFix Consultancy</TITLE>");
            out.println("</HEAD><BODY BGCOLOR=#FFFFFF>");
            out.println("<HR>");
        }

        byte[] b = new byte[100];
        int n;

        // Write only the bytes actually read on each pass
        while((n = in.read(b)) > 0)
            out.write(b, 0, n); // send the original content back
        out.flush();
    }
}

We could have derived the above filter from either GenericServlet or HttpServlet. This example extends HttpServlet; the slightly lighter-weight GenericServlet class, which can also handle non-HTTP data, would serve equally well. After obtaining the ServletInputStream, we check to see whether we are dealing with HTML and, if so, we write the header to the output stream and follow it with the contents of the input stream. If it is not HTML, we simply copy the input stream to the output stream and exit. After installing the servlet in the Java Web Server ( see server documentation for details ), whenever a file other than an alias is requested by a browser, it will be routed through this servlet first. Thus any files of content type “text/html” will have the “BugFix Consultancy” header attached; all other files will simply be passed through.

Servlet chaining is a whole new approach to the creation, maintenance and updating of Web sites. By chaining servlets together we can easily change the appearance of pages or their type of content. We could, for example, use a servlet in a filter role to convert unsupported image types to GIF or JPEG. Compared with scripts, servlet chains can be easily undone or adapted.
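The ease with which a chain can be rearranged is perhaps clearest if we strip the idea down to its essentials. The sketch below is not servlet code: ChainSketch is a hypothetical class in which each stage is an ordinary function from content to content, the footer stage is invented for illustration, and the chain is simply a list of stages applied in order. It assumes a current JDK ( Java 8 or later ).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical model of a servlet chain as a list of content filters. */
final class ChainSketch {

    /** Stage that puts a standard header in front of the content. */
    static final UnaryOperator<String> ADD_HEADER =
            body -> "<H1>BugFix Consultancy</H1><HR>\n" + body;

    /** Stage that appends a standard footer (illustrative only). */
    static final UnaryOperator<String> ADD_FOOTER =
            body -> body + "\n<HR>";

    /** Pipes the content through each stage in order, as a servlet chain would. */
    static String runChain(List<UnaryOperator<String>> chain, String content) {
        String current = content;
        for (UnaryOperator<String> stage : chain) {
            current = stage.apply(current); // output of one stage feeds the next
        }
        return current;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> chain = Arrays.asList(ADD_HEADER, ADD_FOOTER);
        System.out.println(runChain(chain, "<P>Hello</P>"));
    }
}
```

Undoing or adapting the chain amounts to editing the list: removing a stage, reordering stages or adding a new one requires no change to the stages themselves.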

 

4.9 Java Database Connectivity ( JDBC )

Most professional Web sites today have some form of database connectivity, even if they do little more than run SELECT statements and insert single pieces of data. Web site front ends have been connected to a plethora of legacy database systems. That said, most forms of database interaction come with a significant performance overhead. Despite this, most commercial Web sites today are database driven.

JDBC is Sun’s Java extension API for database access. It provides us with a database-independent connectivity API, which can be used in conjunction with servlets to develop Web sites that are easily integrated with backend databases. Probably the biggest advantage of using servlets is the fact that they can maintain open database connections across multiple requests. In real terms, this can shave several seconds off the response time when compared to CGI scripts, which have to reconnect each time. By far the biggest advantage of JDBC is the fact that it is database-independent. This means that we can substitute our site’s database with one from another vendor with a minimum of hassle. JDBC is a SQL-level API, and is a standard part of the JDK 1.1. The API consists primarily of the java.sql package, which is just a set of interfaces and classes that can interoperate with almost any database. In fact, much of the API is made up of database-neutral interfaces that specify behaviour without implementation. The implementations are provided by third-party vendors ( usually at a price ). The JDK ships with a free JDBC-ODBC bridge which enables the user to connect to ODBC data sources such as a Microsoft Access database. Other drivers, usually 100% Java, are available for download from the various manufacturers. For a list of currently available drivers, check out:

http://java.sun.com/products/jdbc/jdbc.drivers.html

Access to a particular database system is achieved through the use of a specific JDBC driver which implements the java.sql.Driver interface. The following fragment shows typical JDBC code as it might appear inside a servlet:

try
{
    // Load the driver class so that it registers itself with the DriverManager
    Class.forName("postgresql.Driver");
    Connection con = DriverManager.getConnection(
        "jdbc:postgresql:dbname", "user", "passwd");
    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery(
        "select Name from customers");

    // Write one list item per row of the result
    out.println("<UL>");
    while (rs.next())
    {
        out.println("<LI>" + rs.getString("name"));
    }
    out.println("</UL>");
    rs.close();
    stmt.close();
    con.close();
}
catch (ClassNotFoundException e)
{
    out.println("The database driver could not be loaded.");
}
catch (SQLException e)
{
    out.println("An SQL Exception was thrown.");
}

The first step is to load the specific driver class into the application’s JVM, using the Class.forName() method. The driver is then available whenever we wish to open a connection to the JDBC URL of the database. JDBC URLs provide us with a unique means of identifying databases. The format is:

jdbc:drivername:databasename

Once a Connection is established, we can create a Statement, which is used to execute an SQL query which returns a ResultSet. We can think of the ResultSet object as the query result being returned one row at a time. We can then use the next() method to move from one row to the next. ResultSet is related to its parent Statement. Thus if a Statement is closed or used to execute some other query, any related ResultSet object will be closed automatically. It is possible to use the ResultSetMetaData interface to learn about the structure of a query result on the fly, and to use this information to dynamically generate an HTML table to display the results.
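The use of ResultSetMetaData mentioned above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: the class name HtmlTableSketch and its methods are our own invention, and the formatting step is kept separate from the JDBC step so that it can be tried without a database.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JDBC: renders any query result as an HTML table.
final class HtmlTableSketch {

    // Reads the column labels from the metadata, then every row, and renders the lot.
    static String toHtmlTable(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columns = meta.getColumnCount();

        List<String> header = new ArrayList<>();
        for (int i = 1; i <= columns; i++) {       // JDBC columns are numbered from 1
            header.add(meta.getColumnLabel(i));
        }

        List<List<String>> rows = new ArrayList<>();
        while (rs.next()) {
            List<String> row = new ArrayList<>();
            for (int i = 1; i <= columns; i++) {
                row.add(rs.getString(i));          // null when the column is SQL NULL
            }
            rows.add(row);
        }
        return render(header, rows);
    }

    // Pure formatting step, kept separate so it can be exercised without a database.
    static String render(List<String> header, List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<TABLE>\n<TR>");
        for (String label : header) {
            html.append("<TH>").append(escape(label)).append("</TH>");
        }
        html.append("</TR>\n");
        for (List<String> row : rows) {
            html.append("<TR>");
            for (String cell : row) {
                html.append("<TD>").append(escape(cell)).append("</TD>");
            }
            html.append("</TR>\n");
        }
        return html.append("</TABLE>").toString();
    }

    // Minimal HTML escaping so that data cannot break the page markup.
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

In a servlet, the result of toHtmlTable(rs) would simply be written to the response with out.println(). Because the column names come from the metadata, the same helper works for any SELECT statement.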

There are many advanced features associated with the JDBC API, too detailed to include here. The first place to look for information should be the Sun website.


4.10 JavaServer Pages ( JSP )

Very similar to Microsoft’s Active Server Pages ( ASP ), JavaServer Pages are a new means of using servlets. JSP works in a similar fashion to server-side includes, except that instead of embedding a <SERVLET> tag in our HTML page, JSP actually embeds snippets of servlet code. This offers us a highly flexible means of constructing Web pages that may consist of intermingled dynamic and static content. JSP does not require any changes to the Servlet API. However, it does require support in our Web server and this is included in the Java Web Server 1.2.

With JSP, we can embed servlet code directly into a static HTML file. A block of servlet code, called a scriptlet, is always surrounded by a leading <% tag and a closing %> tag. For simplicity, four variables have been pre-defined for scriptlet use:

request: The servlet request, an HttpServletRequest object

response: The servlet response, an HttpServletResponse object

out: The output writer, a PrintWriter object

in: The input reader, a BufferedReader object

The following example uses some of these pre-defined variables to produce a ‘Hello’ message. The user can enter their name as a parameter on the Location bar as follows:

http://localhost:8080/hello.jsp?name=Anthony

To get the example to work, it should be saved with the special .jsp extension, and placed under the server’s document root, say server_root/public_html.

<HTML>
<HEAD><TITLE>Hello</TITLE></HEAD>
<BODY>
<%
if (request.getParameter("name") == null)
{
    out.println("Hello World");
}
else
{
    out.println("Hello, " +
        request.getParameter("name"));
}
%>
</BODY></HTML>

Behind the scenes, the server automatically creates, compiles, loads and runs a special servlet to display the contents of the above HTML page. The following diagram illustrates this:

Figure 26: Generating JavaServer Pages

 

This special servlet, created by the server, acts as a sort of workhorse servlet which generates the static parts of the HTML page using the equivalent of out.println() calls. The dynamic parts of the HTML page are included directly. When we first access a JSP page, there is a perceptible delay before getting a response. This initial delay is the time it takes for the server to create and compile the background servlet. If the .jsp file changes, the server notices and will recompile a new background servlet. The following code serves as an example of what the background workhorse servlet for hello.jsp might look like:

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class _hello_xjsp extends HttpServlet
{
    // The parameter and local names match the pre-defined scriptlet variables
    public void service(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        BufferedReader in = request.getReader();

        out.println("<HTML>");
        out.println("<HEAD><TITLE>Hello</TITLE></HEAD>");
        out.println("<BODY>");
        out.println("Hello, " + getName(request));
        out.println("</BODY></HTML>");
    }

    private static final String DEFAULT_NAME = "World";

    // Returns the name parameter, or the default when none was supplied
    private String getName(HttpServletRequest req)
    {
        String name = req.getParameter("name");
        if (name == null)
            return DEFAULT_NAME;
        else
            return name;
    }
}

 

4.10.1 Expressions and Directives

JavaServer Pages also introduce the concepts of expressions and directives. JSP expressions begin with a <%= tag and end with a %> tag. Any Java expression found between these tags is evaluated, converted to a string, and the resulting text embedded directly into the page.

JSP directives begin with a <%@ tag and end with a %> tag. JSP pages use these directives to control certain aspects of the workhorse servlet. For example, they can be used by the workhorse servlet to set its content type, import packages, extend a different superclass, implement interfaces, and handle GET or POST requests. In addition, directives can be used to specify the use of non-Java scripting languages.

There are six key variables which can be assigned between the directive tags in the following format:

<%@ varname = "value" %>

The six key variables are as follows:

content_type: This specifies the content type of the page. The default is "text/html". For example:

<%@ content_type = "text/plain" %>

import: This specifies a list of classes to be imported by the servlet. We can import multiple classes by putting them in a comma-separated list. Alternatively, multiple import directives can be used. For example:

<%@ import = "java.io.*, java.util.Hashtable" %>

extends: This specifies the superclass to be extended by the servlet. The default is HttpServlet. For example:

<%@ extends = "CustomHttpServletSuperclass" %>

implements: This specifies a list of interfaces to be implemented by the servlet. We can implement multiple interfaces by naming them in a comma-separated list, or by using multiple implements directives. The default behaviour is not to implement anything. For example:

<%@ implements = "Serializable" %>

method: This specifies the servlet method which is to contain the generated code and handle client requests. The default is "service", which handles all requests. For example:

<%@ method = "doPost" %>

language: This specifies the scripting language used by the back-end. The default language here is Java. Some servers permit the use of other languages. For example:

<%@ language = "java" %>

The following example is a revised version of the hello.jsp page. It illustrates the use of the method directive to enable the handling of POST requests, and of an expression to simplify the display of the name parameter:

<%@ method = "doPost" %>
<HTML>
<HEAD><TITLE>Hello</TITLE></HEAD>
<BODY>
<% if (request.getParameter("name") == null) { %>
Hello World
<% } else { %>
Hello, <%= request.getParameter("name") %>
<% } %>
</BODY></HTML>

When the background workhorse servlet for the above JSP code is generated, it will be almost identical, except that instead of implementing the service() method, it implements the doPost() method.
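To make the translation step more concrete, the following toy translator turns page text into the Java statements that would form the body of the generated method. It is a hypothetical illustration only and certainly not the page compiler used by the Java Web Server; the class name JspTranslatorSketch is our own. Static text becomes a print call, an expression is wrapped in a print call, a scriptlet is copied through unchanged, and directives are simply skipped because they affect the class rather than the method body.

```java
import java.util.ArrayList;
import java.util.List;

// Toy translator: a hypothetical illustration, NOT the Java Web Server's actual page compiler.
final class JspTranslatorSketch {

    // Turns page text into the Java statements that would form the body of service().
    static List<String> translate(String page) {
        List<String> body = new ArrayList<>();
        int pos = 0;
        while (pos < page.length()) {
            int open = page.indexOf("<%", pos);
            if (open < 0) {                           // no more tags: the rest is static text
                emitStatic(body, page.substring(pos));
                break;
            }
            emitStatic(body, page.substring(pos, open));
            int close = page.indexOf("%>", open + 2);
            if (close < 0) {
                throw new IllegalArgumentException("unterminated <% tag");
            }
            String inner = page.substring(open + 2, close);
            if (inner.startsWith("=")) {              // expression: evaluate and print
                body.add("out.print(" + inner.substring(1).trim() + ");");
            } else if (inner.startsWith("@")) {       // directive: configures the class itself
                // ignored in this sketch
            } else {                                  // scriptlet: copied through unchanged
                body.add(inner.trim());
            }
            pos = close + 2;
        }
        return body;
    }

    // Static markup becomes a print call with the text as a Java string literal.
    private static void emitStatic(List<String> body, String text) {
        if (text.isEmpty()) {
            return;
        }
        String literal = text.replace("\\", "\\\\").replace("\"", "\\\"")
                             .replace("\r", "\\r").replace("\n", "\\n");
        body.add("out.print(\"" + literal + "\");");
    }
}
```

Note how an if/else split across several scriptlets still works: the braces are copied into the generated method in order, so the print calls for the static text between them end up inside the right branch.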

4.10.2 Declarations

From time to time, a JSP page may have to define methods and non-local variables in its workhorse servlet. In order to do this, it uses a construct called a JSP declaration. Such declarations begin with a <SCRIPT RUNAT = "server"> tag and end with a </SCRIPT> tag. Between these tags we place all servlet code that would not normally appear in the service() method. The following code example demonstrates the use of a declaration in our hello.jsp page to define the getName() method:

<HTML>
<HEAD><TITLE>Hello</TITLE></HEAD>
<BODY>
Hello, <%= getName(request) %>
</BODY>
</HTML>

<SCRIPT RUNAT = "server">
private static final String DEFAULT_NAME = "World";

private String getName(HttpServletRequest req)
{
    String name = req.getParameter("name");
    if (name == null)
        return DEFAULT_NAME;
    else
        return name;
}
</SCRIPT>

The background workhorse servlet created to generate this page might look like the following:

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class _hello2_xjsp extends HttpServlet
{
    // The parameter and local names match the pre-defined scriptlet variables
    public void service(HttpServletRequest request,
                        HttpServletResponse response)
        throws ServletException, IOException
    {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        BufferedReader in = request.getReader();

        out.println("<HTML>");
        out.println("<HEAD><TITLE>Hello</TITLE></HEAD>");
        out.println("<BODY>");
        out.println("Hello, " + getName(request));
        out.println("</BODY></HTML>");
    }

    // Everything below was copied from the declaration in the page
    private static final String DEFAULT_NAME = "World";

    private String getName(HttpServletRequest req)
    {
        String name = req.getParameter("name");
        if (name == null)
            return DEFAULT_NAME;
        else
            return name;
    }
}

4.11 JavaServer Pages and JavaBeans

The combination of JavaServer Pages and JavaBeans components introduces a new and powerful development paradigm. Sun’s operational definition is: “A Java Bean is a reusable software component that can be manipulated visually in a builder tool.” JavaBeans are reusable Java classes whose methods and variables follow certain naming conventions. In addition, they can be embedded directly in JavaServer Pages using special <BEAN> tags. Typically, a JavaBean performs a single, focused task. For example, it may execute database queries or maintain information about the client. It can then make its resulting information available to a JSP page via simple accessor methods.

A JavaBeans component embedded in a JSP page can be given special treatment by the Web server. For instance, a bean’s properties can be set automatically by the server using the request’s parameter values. The server uses introspection to determine if the bean has a name property and a setName(String name) method. If it has, then the server can automatically call the setName() method with the value of the name parameter, so that the page itself never has to call the getParameter() method. Introspection is a technique which allows the methods and variables of a Java class to be determined programmatically at runtime. This will be looked at in the next section.
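The following sketch shows roughly what such automatic property setting could look like when written with the standard java.beans.Introspector class. It is an assumption-laden illustration, not the server’s real code: the request parameters are modelled as a plain Map, only String properties are handled, and both BeanPopulatorSketch and its nested HelloBean are hypothetical names.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.Map;

// Sketch of what a server might do; the request parameters are modelled as a plain Map.
final class BeanPopulatorSketch {

    // Hypothetical bean with one "name" property that follows the get/set naming convention.
    public static class HelloBean {
        private String name = "World";

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    // For every writable String property, look for a parameter of the same name and set it.
    static void populate(Object bean, Map<String, String> params) {
        try {
            BeanInfo info = Introspector.getBeanInfo(bean.getClass());
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                Method setter = pd.getWriteMethod();      // null for read-only properties
                String value = params.get(pd.getName());
                if (setter != null && value != null && pd.getPropertyType() == String.class) {
                    setter.invoke(bean, value);
                }
            }
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("could not populate bean", e);
        }
    }
}
```

Calling populate(bean, params) with a map that contains name=Anthony leaves the bean’s getName() returning Anthony. Parameters that do not correspond to a property are silently ignored.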

The JavaBeans component model has a well-defined API, and when used in conjunction with JavaServer Pages, dramatically reduces the amount of code necessary in the page. This has the effect of neatly separating content ( functionality of the bean ) from presentation ( HTML structure of the page ). A bean can also have its scope automatically controlled by the server: it can be assigned either to a specific request or to a client session. With the former, it is used once and destroyed or recycled. With the latter, it is automatically made available every time the client reconnects. A bean can even be implemented as a servlet. For instance, if a bean implements the javax.servlet.Servlet interface ( either directly or by extending GenericServlet or HttpServlet ), the server will detect this and will call the bean’s service() method once for each request. It will also call the bean’s init() method when the bean is first created. This can only be done with some servers, such as the Java Web Server. Such servlets may also be loaded from serialised files.

Beans are embedded in JSP pages using the <BEAN> tag. The syntax is as follows:

<BEAN NAME="lookup name" VARNAME="alternate variable name" TYPE="class or interface name"
INTROSPECT="{yes|no}" BEANNAME="file name"
CREATE="{yes|no}" SCOPE="{request|session}">
<PARAM property1=value1 property2=value2>
</BEAN>

Summary of the attributes of the <BEAN> tag which can be set:

NAME: This specifies the name of the bean. It is also the key under which the bean is saved if its scope extends across several requests. If a bean instance saved under this name exists in the current scope, then that instance is used with this page. For example:

NAME = "userPreferences"

VARNAME: This specifies the variable name of the bean, and is used by the page to refer to the bean and to call its methods. If this is not supplied, then the variable name of the bean is set to the value of its name attribute. For example:

VARNAME = "prefs"

TYPE: This specifies the name of the bean’s class or interface type. The default is java.lang.Object. For example:

TYPE = "UserPreferencesBean"

INTROSPECT: This specifies whether or not the server will set the bean’s properties to the parameter values in the client’s request. The default is "yes".

BEANNAME: This specifies the serialised file or class file that contains the bean. It is used when the bean is first created. This is an optional attribute. For example:

BEANNAME = "hellobean.ser"

CREATE: This specifies if the bean should be created if it does not already exist. The default is "yes".

SCOPE: This specifies if the bean should be assigned to a specific request, or to a client session. The default is "request".

Parameters are passed to a bean as a list using a <PARAM> tag between the opening <BEAN> and the closing </BEAN> tag. Parameter values are used to set the bean’s properties via introspection. The following example demonstrates the use of a JavaBeans component with a JSP page:

<%@ import = "HelloBean" %>

<BEAN NAME="hello" TYPE="HelloBean"
INTROSPECT="yes" CREATE="yes" SCOPE="request">
</BEAN>

<HTML>
<HEAD><TITLE>Hello</TITLE></HEAD>
<BODY>
Hello, <%= hello.getName() %>
</BODY>
</HTML>

As the above example shows, the resulting page is short and simple. People with very little programming experience could soon get to grips with the syntax, and produce Web pages quickly.

 

4.12 Applet-Servlet Communication

There are several techniques by which applets can communicate with servlets. There are also several familiar day-to-day examples in which this form of communication is employed. For example, there is the administrative applet that manages the Java Web Server; it executes on the client but it configures the server. In order to do this, the configuration applet must be in constant communication with the server. A popular application of this technology today is the familiar chat room on the Internet. Here, each applet posts its messages to a central server, and the server then takes care of updating all of the other clients. A more critical application would be the tracking of stock exchange prices and the updating of these on a continuous basis. In this sort of situation, it is imperative that the applet can continually talk with the server, in a secure and speedy manner. A stock tracking applet would need a constant stock feed from a server. If it is an ordinary ‘untrusted’ applet, the only server it may connect to is the machine from which it was downloaded in the first place. If such an applet tried to connect to some other machine, a SecurityException would be thrown.

Whenever an applet is embedded in a Web page, a browser can download it and then execute it. This could have disastrous implications for any machine that might download malicious applets. With this in mind, all applets were deemed ‘untrustworthy’ under JDK 1.0. With this release of JDK, all applets were run under the watch of the SecurityManager, and as a result were very restricted in what they could do. Here, applets could not write to the client’s file system, accept incoming or initiate outgoing socket connections to any but the origin server, etc. This was not a satisfactory situation, because although it may have protected the client, it really limited the usefulness of applets. Then, with the release of JDK 1.1, ‘trusted’ applets were introduced. These trusted applets can operate like any other application, having full access to the client machine. To make an applet ‘trusted’, its code has to be digitally signed by a person or company that the client knows and trusts - this can be marked in the client browser. This way, the origin of that code can be established in a cryptographic and unforgeable manner.

With the more recent JDK 1.2 release, a more flexible, fine-grained access control system has been adopted. Now, a digitally signed applet can be partially trusted, and can be given limited privileges without having free rein on the client machine. Under this new access system, an unknown applet might be able to obtain limited privileges, such as being able to write to a local directory. However, it could still be prevented from wiping out the entire contents of the hard drive.

Let’s say, for example, that the applet is going to get its stock information from the same server from which it was downloaded; there are a number of communication options open to it. Prior to JDK 1.1 and servlets, there were two options for applet-server communication: establish an HTTP connection between the applet and a CGI program on the server, or get the applet to establish a raw socket connection to a non-HTTP server. With the latter scenario, the non-HTTP server listens on a particular port and communicates with the applet using a mutually agreed protocol; this method fails for applets operating behind firewalls, where the browser often cannot connect to a non-HTTP server.

4.12.1 Servlets and Object Serialisation

Object serialisation was one of the important new features of JDK 1.1, and combined with Java servlets, has given new life to applet-server communication. The basic concept of serialisation, while new to Java, has been around a long time. In the world of object-oriented programming, it has been used for years to store and retrieve objects. It is important to make a distinction here: storage and retrieval only involves object data, not object code. Serialisation has nothing to do with Java classes; it applies only to the state of those classes after they have been instantiated as objects. Therefore, when we serialise an object, we are actually saving the fields of the object while the code stays behind. Java serialisation is very useful for file storage, network communication or any situation where object data has to be passed around. Above all, serialisation is of benefit to the user of JavaBeans rather than the programmer. The user can build Java applications with mere mouse clicks. These topics will be looked at in more detail in the next section.

As stated before, servlets have improved the performance of HTTP-based applet-server communication and are starting to replace their slower-starting CGI counterparts. Although an applet and servlet will still take some time to re-establish their connection for each request and response, it is no longer necessary for the applet to wait around while the server launches a CGI script to handle each of its repeated requests.

It is to be expected that both applets and servlets would want to communicate by exchanging Java objects, since they are both written in Java. Java object serialisation is excellent when it comes to sending formatted responses. It is convenient and it provides easy type safety. For instance, our stock tracking applet can ask our stock feed servlet the daily high for a particular stock and receive a response as a serialised StockPrice object. The fields of this object will contain the data required.
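As a minimal sketch of this exchange, the class below serialises a StockPrice object to a byte stream and reads it back again. The field names, the StockPriceWire helper and the use of an in-memory byte array in place of the HTTP response are all assumptions made for illustration; in a real servlet the ObjectOutputStream would wrap the response’s output stream, and the applet would wrap the stream of its URL connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value object; the field names are assumptions made for this sketch.
class StockPrice implements Serializable {
    private static final long serialVersionUID = 1L;
    final String symbol;
    final double dailyHigh;

    StockPrice(String symbol, double dailyHigh) {
        this.symbol = symbol;
        this.dailyHigh = dailyHigh;
    }
}

final class StockPriceWire {

    // What the servlet side would do: write the object's state to a byte stream.
    static byte[] toBytes(StockPrice price) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(price);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // What the applet side would do: rebuild an equivalent object from the stream.
    static StockPrice fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (StockPrice) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The cast in fromBytes() is where the type safety mentioned above comes from: if the stream contained anything other than a StockPrice, the applet would find out immediately rather than misreading a hand-parsed string.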

It should be noted that object serialisation only works with applets running in browsers that support JDK 1.1 or later. Note also that not every Java object can be serialised. If a bean stores a non-serialisable object as a field, then the bean itself cannot be serialised unless that field is marked transient; this can cause major problems when the bean is loaded into a builder tool. Another point to note is that because Java threads are not serialisable, extra care has to be taken when writing multi-threaded beans. Any unexpected change in a variable due to serialisation could corrupt the thread. Finally, sensitive information such as passwords should not be serialised, unless it is encrypted beforehand.
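These restrictions can be demonstrated with a short sketch. The two bean classes below are hypothetical: UnsafeBean keeps a Thread in an ordinary field and therefore cannot be written to an object stream, whereas SafeBean marks both its thread and its password as transient, so that they are simply left out of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class TransientSketch {

    // Thread is not Serializable, so writing this bean fails.
    static class UnsafeBean implements Serializable {
        private static final long serialVersionUID = 1L;
        Thread worker = new Thread(() -> { });
    }

    // Transient fields are skipped when the bean is written.
    static class SafeBean implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "anthony";
        transient String password = "secret";          // sensitive: never written
        transient Thread worker = new Thread(() -> { });
    }

    // Returns false when the object, or something it refers to, is not serialisable.
    static boolean canSerialise(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the bean to memory and reads it back, as a network transfer would.
    static SafeBean roundTrip(SafeBean bean) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(bean);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (SafeBean) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip, the copy still has its user field, but password and worker are both null. Code that receives such a bean must therefore be prepared to re-create its transient state, for example by starting a fresh thread.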

 

4.12.2 JDBC, RMI and CORBA IDL

As stated already, JDBC allows a Java program to connect to a relational database on the same machine, or on a remote machine. Java applets written to JDK 1.1, or later, can use JDBC to communicate with a database on the server. Although an applet can connect to the database directly, it is often more convenient for it to connect to a servlet that will handle the database connection on its behalf. For example, if an applet wanted to look up a person’s address, it could connect to a servlet using HTTP and pass the name as an HTTP parameter. It could then receive the response containing the address as a pre-formatted string or serialised object.
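The address lookup just described might be sketched on the client side as follows. The servlet path /servlet/AddressLookup and the class name AddressLookupClient are invented for this example, and the response is assumed to be a single line of text. In a real applet, the base URL would be derived from the applet’s own code base, since an untrusted applet may only connect to its origin server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URL;
import java.net.URLEncoder;

final class AddressLookupClient {

    // Builds the GET URL; "/servlet/AddressLookup" is a made-up servlet path.
    static String buildUrl(String base, String name) {
        try {
            // The name must be URL-encoded so that spaces and symbols survive the trip
            return base + "/servlet/AddressLookup?name=" + URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);   // UTF-8 is always available
        }
    }

    // Sends the request and returns the first line of the servlet's response.
    static String lookup(String base, String name) throws IOException {
        URL url = URI.create(buildUrl(base, name)).toURL();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), "UTF-8"))) {
            return in.readLine();
        }
    }
}
```

On the server, the servlet would read the value with request.getParameter("name"), run its JDBC query, and write the address back as plain text.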

The RMI ( Remote Method Invocation ) API enables easy development of distributed Java applications. RMI is meant explicitly for Java on both the client and server sides. Using RMI, an applet can invoke methods of a Java object executing on the server machine, and in some cases RMI allows for callbacks; that is, it allows objects on the server to call the methods of the applet. For example, in a stock tracker scenario, the server could notify interested applets if a stock price has changed by calling applet.update(stock).

In order to create a distributed application, all that has to be done is to define some remote interfaces. When these interface specifications are coded, the clients and servers implement those interfaces. RMI calls are then sent from the client to the server. The underlying details of how the calls are sent are hidden. That is to say, neither the client nor the server needs to know how the information is sent across the network. Client applets connect to the server, obtain handles on the server objects, and register their existence.

Although it is simple to use, RMI is complicated under the surface; RMI communication uses special stub and skeleton classes for each remote object. A special naming registry is also required, from which clients can obtain references to these remote objects. Disregarding the overhead for stub and skeleton code, the actual connection between client and server can be implemented in less than five lines of code. By a connection here, we mean the process by which the client obtains a handle to an object on the server. Once the handle is obtained, we can consider the connection established. The client is then free to use the object as it would any other. If there is a network fault, then a remote exception is thrown, gracefully informing the client that the server is no longer available.
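A minimal sketch of these pieces is given below, using the stock feed scenario. All of the names ( RmiSketch, StockFeed, the registry name "StockFeed" and the hard-coded price ) are assumptions made for illustration. On current JDKs the stub is generated at run time by exportObject(), so no separate stub compilation step is shown.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RmiSketch {

    // Remote interface shared by client and server; every method may throw RemoteException.
    public interface StockFeed extends Remote {
        double dailyHigh(String symbol) throws RemoteException;
    }

    // Server-side implementation; a hard-coded value stands in for a real feed.
    public static class StockFeedImpl implements StockFeed {
        @Override
        public double dailyHigh(String symbol) {
            return "SUNW".equals(symbol) ? 42.5 : 0.0;
        }
    }

    // Server: export the object and bind its stub in a registry under a well-known name.
    static void serve(StockFeedImpl impl, int port) throws RemoteException {
        StockFeed stub = (StockFeed) UnicastRemoteObject.exportObject(impl, 0);
        Registry registry = LocateRegistry.createRegistry(port);
        registry.rebind("StockFeed", stub);
    }

    // Client: obtaining the handle really does take only a couple of lines.
    static StockFeed connect(String host, int port) throws RemoteException, NotBoundException {
        Registry registry = LocateRegistry.getRegistry(host, port);
        return (StockFeed) registry.lookup("StockFeed");
    }
}
```

Once connect() has returned, a call such as feed.dailyHigh("SUNW") looks like an ordinary method call, except that the client must be prepared to catch RemoteException if the network or the server fails.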

When the client requests a handle on a remote object on the server, it is not the object itself that is sent over the wire but a stub: a small proxy which implements the same remote interface and forwards each call to the server, where the method actually executes. Arguments and return values are treated differently. Those that are not themselves remote objects are copied in full, and any objects contained within them also get pushed across the network, provided that they are serialisable. The process of packaging a method request, its arguments or its result for transmission from one computer to another is termed ‘marshaling’. Through serialisation, objects are treated as a stream. These object streams conform to the same interface as file input and output streams. These streams, in turn, are just like input and output streams for sockets. In effect, the object is actually written out across the network as a stream. The receiving node reads the stream and re-assembles the object in memory.

CORBA, or Common Object Request Broker Architecture, is an open standard which defines a ‘software bus’ to provide a means of communication between software applications regardless of implementation language or platform. As such it represents the most important and the most ambitious middleware project ever undertaken in the software industry.

CORBA enables a piece of software to declare, in a common language called the Interface Definition Language, or IDL, the interface that it wants to make available to other pieces of software; these pieces of software are called objects. The communication between objects in the system is achieved through invoking methods defined by the object’s interface, using the ORB ( Object Request Broker ) as a broker and GIOP ( General Inter-ORB Protocol ) as a communications protocol. IIOP ( Internet Inter-ORB Protocol ) is the specialisation of GIOP for TCP/IP and is the standard for interoperability between ORBs on the Internet. The declared interfaces may be compiled into code stubs in many different languages using different IDL compilers. These code stubs form the framework that enables CORBA based clients and servers to be implemented. With CORBA and the IIOP communication protocol, a C++ client can communicate with a Java servlet. Since CORBA allows many different computer languages to interoperate, we can choose an optimal language for solving a particular piece of a problem without having to use that language in places where other computer languages excel. In Java, we can define remote interfaces in IDL and use Java mapping services to generate the stubs and interfaces. Our JavaBeans client component can then invoke the methods of IDL objects on remote servers.

To sum up, we have outlined the following distribution mechanisms:

JDBC
Java RMI
Java IDL

There is another alternative, and that is to use a ‘hybrid’ approach. If we can guarantee that our potential clients will support it, we could stick with RMI as an elegant and powerful solution. When RMI is not available, we may be tempted by the bidirectional capabilities of the non-HTTP socket connection. However, this will be of little use if the applet ends up on the far side of a firewall. In this case, we may fall back on the old reliable HTTP solution. It is simple to implement and works on every Java capable client. Also, if we can guarantee that the client supports JDK 1.1, we can use object serialisation.

Ultimately, the best solution might be to use all of these solutions, with servlets. We can combine the HTTP, non-HTTP, and RMI applet-server communication techniques, supporting all of them with a single servlet. That is, one servlet, multiple access protocols - depending on the circumstances. By using the same servlet to handle every client, the core server logic and the server state can be collected in one place.

4.13 Summary

Since this document is primarily a tutorial and its aim is to be as broad as possible in its scope, not every aspect of the subject matter could be covered in any great detail. Topics such as session tracking, security and interservlet reuse, all of which can be easily implemented with servlets, have been left out. Details of these and other related topics can be readily researched via the Internet. The following URL is a good starting-off point:

http://java.sun.com/

Java servlets provide a number of enhancements to the current CGI architecture. Writing a CGI script in Perl may give it a semblance of platform independence, but it also requires that each request start a separate Perl interpreter. This takes even more time and requires extra resources. Because a CGI script runs in a separate process, it cannot interact with the Web server or take advantage of the server’s abilities once it starts running. For instance, it cannot write to the server’s log file. Unlike CGI, which has to spawn multiple processes to handle separate requests, servlets are all handled by separate threads within the Web server process. This means that servlets are more efficient and scalable than CGI scripts. More important still, servlets are portable, across both operating systems and Web servers.

We have listed the advantages of servlets above. Now it is time to take a realistic look at some of the shortcomings of servlets. First of all, servlets are slow. Although they are an order of magnitude faster than CGI, they are slower than most other alternatives. We do not show the benchmarks here, but these can easily be verified by the reader. A simple count program could be designed and tested with the appropriate software and hardware, over a known network configuration. For example, tests have shown that servlets are approximately 15 times slower than CORBA static invocations. Bear in mind that CORBA ORBs are not themselves noted for their performance!

A common place for servlets in a three tier architecture is in the middle tier - commonly referred to as middleware. The middle tier lies between the client and the ultimate data source, and can include business logic. Business logic abstracts such complicated low-level tasks as database updates into high-level tasks like placing an order. This has the effect of making the whole operation simpler and safer. Middleware can improve efficiency by spreading the processing load across several back end servers - CPU servers, database servers, file servers, etc. However, like CGI and sockets, servlets provide a very primitive form of middleware. We have to do our own marshaling and unmarshaling of parameters. Servlets do not support typed interfaces, so we must create our own command formats. For example, suppose we had to implement a simple server that exposes a dozen interfaces to its clients. In addition to this, say each interface consists of a dozen methods that each require five or six parameters with different data types. Something this simple could quickly become non-trivial using servlets.
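To give a feel for this hand-written unmarshaling, the sketch below converts the string parameters of a single, hypothetical ‘place order’ request into a typed object. The parameter names, the Order class and the use of a Map in place of the servlet request are all assumptions; in a servlet, each value would come from a getParameter() call.

```java
import java.util.Map;

final class OrderCommandSketch {

    // Hypothetical typed request that the business logic would prefer to work with.
    static final class Order {
        final String symbol;
        final int quantity;
        final double limit;

        Order(String symbol, int quantity, double limit) {
            this.symbol = symbol;
            this.quantity = quantity;
            this.limit = limit;
        }
    }

    // Every parameter arrives as a string and must be checked and converted by hand.
    static Order parse(Map<String, String> params) {
        String symbol = require(params, "symbol");
        try {
            int quantity = Integer.parseInt(require(params, "quantity"));
            double limit = Double.parseDouble(require(params, "limit"));
            return new Order(symbol, quantity, limit);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("malformed numeric parameter", e);
        }
    }

    // Rejects requests in which a mandatory parameter is absent or empty.
    private static String require(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("missing parameter: " + key);
        }
        return value;
    }
}
```

This is the code for one method with three parameters. Multiplying it by a dozen interfaces of a dozen methods each shows why a typed remote interface, as offered by RMI or CORBA, soon becomes attractive.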

Servlets are about one notch better than CGI or sockets in the Web middleware hierarchy. They provide a simple callback API to the Web server. They also include some useful helper functions that can extract HTTP name / value pairs and construct a dynamic HTTP response. However, the Servlet API is a very limited distributed object solution, as it does not take advantage of interfaces to provide higher levels of abstraction to the services we write. Servlets lack many important features we would expect from a scalable server-side component architecture. For instance, they do not support transactions, which are a standard feature of the CORBA / Enterprise JavaBeans server-side component model. We will take a closer look at this relatively new EJB component architecture in the next section.
