5. Web programming: A crash course

You are probably reading this in a web browser, so you are likely to be at least a little familiar with the World Wide Web. This chapter contains a quick, superficial introduction to the various elements that make the web work, and the way they relate to JavaScript through jQuery. The three after this one show some of the ways jQuery can be used to inspect and change a web-page.

5.1. Clients and Servers

The Internet is, basically, just a computer network spanning most of the world. Computer networks make it possible for computers to send each other messages. The techniques that underlie networking are an interesting subject, but not the subject of this book. All you have to know is that, typically, one computer, which we will call the server, is waiting for other computers to start talking to it. Once another computer, the client, opens communications with this server, they will exchange whatever it is that needs to be exchanged using some specific language, a protocol.

The Internet is used to carry messages for many different protocols. There are protocols for chatting, protocols for file sharing, protocols used by malicious software to control the computer of the poor schmuck who installed it, and so on. The protocol that is of interest to us is that used by the World Wide Web. It is called HTTP, which stands for Hypertext Transfer Protocol, and is used to retrieve web-pages and the files associated with them.

In HTTP communication, the server is the computer on which the web-page is stored. The client is the computer, such as yours, which asks the server for a page, so that it can display it. Asking for a page like this is called an HTTP request, and the exchange of messages between the client (usually a web browser) and the server is called the request-response cycle.
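
To make the idea of a request a little more concrete, here is a minimal sketch of the text a client might send to ask for a page. The helper name buildGetRequest and the host example.com are made up for this illustration, and real browsers send many more header lines than this.

```javascript
// Build the text of a very simple HTTP/1.1 GET request.
// buildGetRequest is a hypothetical helper, not part of any library.
function buildGetRequest(host, path) {
  return "GET " + path + " HTTP/1.1\r\n" +  // request line: method, path, version
         "Host: " + host + "\r\n" +          // the server we are talking to
         "\r\n";                             // an empty line ends the request
}

var request = buildGetRequest("example.com", "/index.html");
```

The server answers with a similar block of text: a status line such as HTTP/1.1 200 OK, some header lines, an empty line, and then the requested document.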

5.2. URLs

Web-pages and other files that are accessible through the Internet are identified by URLs, which is an abbreviation of Uniform Resource Locators. A URL looks like this:

http://acc6.its.brooklyn.cuny.edu/~phalsal/texts/taote-v3.html

It is composed of three parts. The start, http://, indicates that this URL uses the HTTP protocol. There are some other protocols, such as FTP (File Transfer Protocol) and SSH, which also make use of URLs. The next part, acc6.its.brooklyn.cuny.edu, names the server on which this page can be found. The end of the URL, /~phalsal/texts/taote-v3.html, names a specific file on this server.
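
The same three parts can be picked out from JavaScript. The following is only a sketch, and it assumes an environment that provides the standard URL object (current browsers and Node.js do); the variable names are chosen for this example.

```javascript
// Split a URL into the three parts described above.
var url = new URL("http://acc6.its.brooklyn.cuny.edu/~phalsal/texts/taote-v3.html");

var protocol = url.protocol;  // "http:" (the colon is included, the slashes are not)
var server = url.hostname;    // "acc6.its.brooklyn.cuny.edu"
var file = url.pathname;      // "/~phalsal/texts/taote-v3.html"
```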

Most of the time, the World Wide Web is accessed using a browser. After typing a URL or clicking a link, the browser makes the appropriate HTTP request to the appropriate server. If all goes well, the server responds by sending a file back to the browser, which shows it to the user in one way or another.

5.3. HTML

HTML stands for HyperText Mark-up Language. An HTML document is all text. Because it must be able to express the structure of this text, information about which text is a heading, which text is purple, and so on, a few characters have a special meaning, somewhat like backslashes in JavaScript strings. The “less than” and “greater than” characters are used to create HTML tags. Most tags occur in pairs, with a start tag and an end tag, with text data between them. The start and end tag together with the enclosed text form an HTML element.

Elements provide extra information about the data in the document. They can stand on their own, for example to mark the place where a picture should appear in the page, or they can contain text and other elements, for example when they mark the start and end of a paragraph.

Some elements are compulsory: a whole HTML document must always be contained in an html element. Here is an example of an HTML document:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>A quote</title>
</head>

<body>
<h1>A quote</h1>

<blockquote>
<p>The connection between the language in which we think/program and the
problems and solutions we can imagine is very close.  For this reason
restricting language features with the intent of eliminating programmer
errors is at best dangerous.</p>

<p>-- Bjarne Stroustrup</p>
</blockquote>

<p>Mr. Stroustrup is the inventor of the C++ programming language, but
quite an insightful person nevertheless.</p>

<p>Also, here is a picture of an ostrich:</p>

<p><img src="img/ostrich.png" alt="ostrich picture"></p>

</body>
</html>

A rendered version of this web page can be seen here.

Elements that contain text or other tags are first opened with <tagname>, and afterwards finished with </tagname>. The html element always contains two children: head and body. The first contains information about the document, the second contains the actual document.

Most tag names are cryptic abbreviations. h1 stands for “heading 1”, the biggest kind of heading. There are also h2 to h6 for successively smaller headings. p means “paragraph”, and img stands for “image”. The img element does not contain any text or other tags, but it does have some extra information, src="img/ostrich.png" and alt="ostrich picture", which are called attributes. In this case, they contain information about the image file that should be shown here.

Because < and > have a special meaning in HTML documents, they can not be written directly in the text of the document. If you want to say “5 < 10” in an HTML document, you have to write “5 &lt; 10”, where “lt” stands for “less than”. “&gt;” is used for “>”, and because these codes also give the ampersand character a special meaning, a plain “&” is written as “&amp;”.
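
When text is put into an HTML document by a program, this replacement has to be done by the program. A minimal sketch, in which escapeHTML is a name invented for this example rather than a built-in function:

```javascript
// Replace the three characters that are special in HTML text.
// The ampersand must be handled first, otherwise the "&" in the
// freshly inserted "&lt;" and "&gt;" would be escaped a second time.
function escapeHTML(text) {
  return text.replace(/&/g, "&amp;")
             .replace(/</g, "&lt;")
             .replace(/>/g, "&gt;");
}

var escaped = escapeHTML("5 < 10 & 10 > 5");
// escaped is "5 &lt; 10 &amp; 10 &gt; 5"
```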

Now, those are only the bare basics of HTML, but they should be enough to make it through this chapter, and later chapters that deal with HTML documents, without getting entirely confused.

5.4. CSS

CSS stands for Cascading Style Sheets. CSS is a styling language designed to describe the look and formatting (the presentation semantics) of web pages. Together with HTML and JavaScript, it is one of the three languages that can be natively consumed by web browsers.

CSS syntax consists of a collection of styles or rules. Each rule is composed of a selector and a declaration block. The selector determines (selects) which HTML elements the style will apply to. The declaration block is in turn composed of a sequence of property-value pairs. The property is separated from the value by a colon (:), and property-value pairs are separated from each other by a semi-colon (;).

Here is an example of a style sheet:

body {
    margin: 60px;
    padding: 40px;
    background-color: Cornsilk;
    border: 1px solid gray;
}
h1 {
    margin-left: -20px;
    color: orange;
    font-family: Helvetica, sans-serif;
}
img {
    padding: 20px;
    border: 1px solid black;
    background-color: WhiteSmoke;
}
blockquote {
    padding: 10px;
    border: 1px dashed SaddleBrown;
}
blockquote p {
    font-style: italic;
}
blockquote p.author {
    font-style: normal;
    margin-left: 30px;
    color: DarkGoldenrod;
}
.picture {
    text-align: center;
}

Styles can be applied internally to an HTML document using style elements (between <style type="text/css"></style> tags) in the document header. Here is the preceding quote web page with the style included:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>A quote</title>
<style type="text/css">
body {
    margin: 60px;
    padding: 40px;
    background-color: Cornsilk;
    border: 1px solid gray;
}
h1 {
    margin-left: -20px;
    color: orange;
    font-family: Helvetica, sans-serif;
}
img {
    padding: 20px;
    border: 1px solid black;
    background-color: WhiteSmoke;
}
blockquote {
    padding: 10px;
    border: 1px dashed SaddleBrown;
}
blockquote p {
    font-style: italic;
}
blockquote p.author {
    font-style: normal;
    margin-left: 30px;
    color: DarkGoldenrod;
}
.picture {
    text-align: center;
}
</style>
</head>

<body>
<h1>A quote</h1>

<blockquote>
<p>The connection between the language in which we think/program and the
problems and solutions we can imagine is very close.  For this reason
restricting language features with the intent of eliminating programmer
errors is at best dangerous.</p>

<p class="author">-- Bjarne Stroustrup</p>
</blockquote>

<p>Mr. Stroustrup is the inventor of the C++ programming language, but
quite an insightful person nevertheless.</p>

<p>Also, here is a picture of an ostrich:</p>

<p class="picture"><img src="img/ostrich.png" alt="ostrich picture"></p>

</body>
</html>

Here is this web page rendered by your browser.

class and id attributes can be added to HTML elements for the purpose of styling them with CSS. In this example, the second paragraph element in the blockquote has been given a class named “author” and the paragraph containing the ostrich picture has been given the class name “picture”. This example makes use of a number of Web colors.

Learning more about HTML and CSS

A working knowledge of HTML and CSS is a prerequisite for using JavaScript and jQuery for client-side web scripting. Presentation of the details is outside the scope of this book. A quick but sufficient introduction to both of these topics can be found in Getting Down with HTML and Getting Down with CSS.

5.5. Server-side scripting

Although a URL usually points at a file, it is possible for a web-server to do something more complicated than just looking up a file and sending it to the client. It can process the file in some way first, or maybe there is no file at all, but only a program that, given a URL, has some way of generating the relevant document for it.

Programs that transform or generate documents on a server are a popular way to make web-pages less static. When a file is just a file, it is always the same, but when there is a program that builds it every time it is requested, it could be made to look different for each person, based on things like whether this person has logged in or specified certain preferences. This can also make managing the content of web-pages much easier ― instead of adding a new HTML file whenever something new is put on a website, a new document is added to some central storage, and the program knows where to find it and how to show it to clients. This kind of web programming is called server-side scripting. It affects the document before it is sent to the user.

5.6. Client-side programming

In some cases, it is also practical to have a program that runs after the page has been sent, when the user is looking at it. This is called client-side scripting, because the program runs on the client computer. Client-side web scripting is what JavaScript was invented for.

The scripts are enclosed in script elements (between <script></script> tags). In addition to knowing how to render HTML, almost all current web browsers have built-in JavaScript engines that enable them to interpret JavaScript source included in script elements.

It is also possible to include JavaScript source in a separate file. Browsers load JavaScript files when they find a start <script> tag in a web page with a src attribute whose value is the URL of the file containing the JavaScript code. The extension .js is usually used for files containing JavaScript code.

These files can be located on the same machine with the web page or anywhere on the web. The browser will fetch all these extra files from their servers, so it can add them to the document.

5.7. Sand-boxing

Running programs client-side has an inherent problem. You can never really know in advance what kinds of programs the page you are visiting is going to run. If it can send information from your computer to others, damage something, or infiltrate your system, surfing the web would be a rather hazardous activity.

To solve this dilemma, browsers severely limit the things a JavaScript program may do. It is not allowed to look at your files, or to modify anything not related to the web-page it came with. Isolating a programming environment like this is called sand-boxing, because a special environment, called a sandbox, is created for the program to run in.

Allowing the programs enough room to be useful, and at the same time restricting them enough to prevent them from doing harm is not an easy thing to do. Every few months, some JavaScript programmer comes up with a new way to circumvent the limitations and do something harmful or privacy-invading. The people responsible for the browsers respond by modifying their programs to make this trick impossible, and all is well again ― until the next problem is discovered.

5.8. Dealing with browsers

In the early days of the web, client-side web programming was no walk in the park. It was, at times, a very painful ordeal. Why? Because programs that are supposed to run on the client computer generally have to work for all popular browsers. Different web browsers usually use different JavaScript engines, which can work slightly differently in the way they run JavaScript.

To make things worse, each JavaScript engine contains its own unique set of problems. Do not assume that a program is bug-free just because it was made by a multi-billion dollar company. So it is up to us, the web programmers, to rigorously test our programs, figure out what goes wrong, and find ways to work around it.

But do not let that discourage you. With the right kind of obsessive-compulsive mindset, such problems provide wonderful challenges. And for those of us who do not like wasting our time, being careful and avoiding the obscure corners of the browser’s functionality will generally prevent you from running into too much trouble.

Fortunately, things have gotten much better for web developers in recent years. At the time of this writing (2014), all the major browsers (Internet Explorer, Firefox, Google Chrome, and Safari) support current standards pretty well. Also, JavaScript libraries like jQuery help make the task of working with multiple browsers much easier.

It is truly a good time to be a web developer!

5.9. Libraries

A module or group of modules that can be useful in more than one program is usually called a library. For many programming languages, there is a huge set of quality libraries available. This means programmers do not have to start from scratch all the time, which can make them a lot more productive.

In the early days of the web, JavaScript libraries were comparatively scarce, but in recent years a number of good JavaScript libraries have emerged. Using a library is recommended: it is less work, and the code in a library has usually been tested more thoroughly than the things you wrote yourself.

Among the “lightweight” libraries that provide a basic toolkit are jQuery, MochiKit, prototype, and mootools. There are also some larger ‘frameworks’ available, which do a lot more than just provide a set of basic tools. AngularJS (by Google), YUI (by Yahoo), Dojo, and Backbone are among the more popular ones in that genre. All of these can be downloaded and used freely. Among these, jQuery has emerged as something of a de facto standard, and it is the library we will be using in this book.

The fact that a basic toolkit is almost indispensable for any non-trivial JavaScript program, combined with the fact that there are so many different toolkits, causes a bit of a dilemma for library writers. You either have to make your library depend on one of the toolkits, or write the basic tools yourself and include them with the library. The first option makes the library hard to use for people who are using a different toolkit, and the second option adds a lot of non-essential code to the library. This dilemma might be one of the reasons why there are relatively few good, widely used JavaScript libraries. It is possible that, in the future, new versions of ECMAScript and changes in browsers will make toolkits less necessary, and thus (partially) solve this problem.

5.10. Forms

Another popular application of JavaScript in web pages centers around forms. In case you are not quite sure what the role of “forms” is, let me give a quick summary.

A basic HTTP request is a simple request for a file. When this file is not really a passive file, but a server-side program, it can become useful to include information other than a filename in the request. For this purpose, HTTP requests are allowed to contain additional ‘parameters’. Here is an example:

http://www.google.com/search?q=aztec%20empire

After the filename (/search), the URL continues with a question mark, after which the parameters follow. This request has one parameter, called q (for “query”, presumably), whose value is aztec empire. The %20 part corresponds to a space. There are a number of characters that can not occur in these values, such as spaces, ampersands, or question marks. These are “escaped” by replacing them with a % followed by their numerical value, written in hexadecimal, which serves the same purpose as the backslashes used in strings and regular expressions, but is even more unreadable.

Note

The value a character gets is decided by the ASCII standard, which assigns the numbers 0 to 127 to a set of letters and symbols used by the Latin alphabet. This standard is a precursor of the Unicode standard mentioned in Basic JavaScript: values, variables, and control flow.
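
The link between a character and its escape code can be checked by hand. The sketch below assumes the character is one of the 128 ASCII characters; percentEscape is a name invented for this example.

```javascript
// Turn a single ASCII character into its %XX escape code.
function percentEscape(character) {
  // charCodeAt gives the character's number, toString(16) writes it in hexadecimal.
  var hex = character.charCodeAt(0).toString(16).toUpperCase();
  if (hex.length < 2) hex = "0" + hex;  // the code always has two digits
  return "%" + hex;
}

var space = percentEscape(" ");      // "%20", because a space is number 32
var ampersand = percentEscape("&");  // "%26", because & is number 38
var question = percentEscape("?");   // "%3F", because ? is number 63
```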

JavaScript provides functions encodeURIComponent and decodeURIComponent to add these codes to strings and remove them again.

var encoded = encodeURIComponent("aztec empire");
alert(encoded);
alert(decodeURIComponent(encoded));

When a request contains more than one parameter, they are separated by ampersands, as in:

http://www.google.com/search?q=aztec%20empire&lang=nl
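
Such a parameter string can be put together from an object that holds the names and values. This is a minimal sketch; buildQuery is a name made up for the example, not a built-in function.

```javascript
// Turn an object like {q: "aztec empire", lang: "nl"} into "q=aztec%20empire&lang=nl".
function buildQuery(params) {
  var parts = [];
  for (var name in params) {
    // Skip properties inherited from a prototype.
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      parts.push(encodeURIComponent(name) + "=" + encodeURIComponent(params[name]));
    }
  }
  return parts.join("&");
}

var query = buildQuery({q: "aztec empire", lang: "nl"});
var fullURL = "http://www.google.com/search?" + query;
```

Escaping both the name and the value means that an ampersand or equals sign inside a value can not be mistaken for a separator.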

A form, basically, is a way to make it easy for browser-users to create such parameterised URLs. It contains a number of fields, such as input boxes for text, checkboxes that can be “checked” and “unchecked”, or thingies that allow you to choose from a given set of values. It also usually contains a “submit” button and, invisible to the user, an “action” URL to which it should be sent. When the submit button is clicked, or enter is pressed, the information that was entered in the fields is added to this action URL as parameters, and the browser will request this URL.

Here is the HTML for a simple form:

<form name="userinfo" method="get" action="info.html">
  <p>Please give us your information, so that we can send
  you spam.</p>
  <p>Name: <input type="text" name="name"/></p>
  <p>E-Mail: <input type="text" name="email"/></p>
  <p>Sex: <select name="sex">
            <option>Male</option>
            <option>Female</option>
            <option>Other</option>
          </select></p>
  <p><input name="send" type="submit" value="Send!"/></p>
</form>

The name of the form can be used to access it with JavaScript, as we shall see in a moment. The names of the fields determine the names of the HTTP parameters that are used to store their values. Sending this form might produce a URL like this:

http://planetspam.com/info.html?name=Ted&email=ted@zork.com&sex=Male
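
Going the other way, a program that receives such a URL has to take the parameters apart again. The following sketch shows one way to do that; parseQuery is a hypothetical helper, and it assumes a URL without a #fragment part and does not treat + as a space, both of which a more complete version would handle.

```javascript
// Extract the parameters of a URL into an object of names and values.
function parseQuery(url) {
  var result = {};
  var questionMark = url.indexOf("?");
  if (questionMark == -1) return result;  // no parameters at all

  var pairs = url.slice(questionMark + 1).split("&");
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i];
    if (pair == "") continue;
    var equals = pair.indexOf("=");
    // A parameter without "=" is treated as having an empty value.
    var name = equals == -1 ? pair : pair.slice(0, equals);
    var value = equals == -1 ? "" : pair.slice(equals + 1);
    result[decodeURIComponent(name)] = decodeURIComponent(value);
  }
  return result;
}

var info = parseQuery("http://planetspam.com/info.html?name=Ted&email=ted@zork.com&sex=Male");
// info.name is "Ted", info.email is "ted@zork.com", info.sex is "Male"
```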

There are quite a few other tags and properties that can be used in forms, but in this book we will stick with simple ones, so that we can concentrate on JavaScript.

5.11. get and post

The method="get" property of the example form shown above indicates that this form should encode the values it is given as URL parameters, as shown before. There is an alternative method for sending parameters, which is called post. An HTTP request using the post method contains, in addition to a URL, a block of data. A form using the post method puts the values of its parameters in this data block instead of in the URL.
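
The difference is easiest to see in the text of the requests themselves. The sketch below builds a simplified version of both; buildFormRequest is a name invented for this example, real browsers add more header lines, and the length calculation assumes the data contains only ASCII characters.

```javascript
// Build the text of a simplified HTTP request that carries form data,
// either in the URL (get) or in a data block after the headers (post).
function buildFormRequest(method, host, path, data) {
  if (method == "get") {
    return "GET " + path + "?" + data + " HTTP/1.1\r\n" +
           "Host: " + host + "\r\n" +
           "\r\n";
  }
  return "POST " + path + " HTTP/1.1\r\n" +
         "Host: " + host + "\r\n" +
         "Content-Type: application/x-www-form-urlencoded\r\n" +
         "Content-Length: " + data.length + "\r\n" +  // size of the data block
         "\r\n" +
         data;                                        // parameters go here, not in the URL
}

var formData = "name=Ted&sex=Male";
var getRequest = buildFormRequest("get", "planetspam.com", "/info.html", formData);
var postRequest = buildFormRequest("post", "planetspam.com", "/info.html", formData);
```

Note that the parameters are encoded the same way in both cases. Only their position in the request changes.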

When sending big chunks of data, the get method will result in URLs that are a mile wide, so post is usually more convenient. But the difference between the two methods is not just a question of convenience. Traditionally, get requests are used for requests that just ask the server for some document, while post requests are used to take an action that changes something on the server. For example, getting a list of recent messages on an Internet forum would be a get request, while adding a new message would be a post request. There is a good reason why most pages follow this distinction ― programs that automatically explore the web, such as those used by search engines, will generally only make get requests. If changes to a site can be made by get requests, these well-meaning ‘crawlers’ could do all kinds of damage.

5.12. Form validation

When the browser is displaying a page containing a form, JavaScript programs can inspect and modify the values that are entered in the form’s fields. This opens up possibilities for all kinds of tricks, such as checking values before they are sent to the server, or automatically filling in certain fields.

The form shown above can be found in the file example_getinfo.html. Open it.

var form = window.open("example_getinfo.html");

When a URL does not contain a server name, it is called a relative URL. Relative URLs are interpreted by the browser to refer to files on the same server as the current document. Unless they start with a slash, the path (or directory) of the current document is also retained, and the given path is appended to it.
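
These rules can be tried out with the standard URL object, which takes a relative URL and the URL it should be interpreted against. This is only a sketch: it assumes an environment that provides URL, and the address http://example.com/book/chapter5.html is a stand-in for wherever the current document lives.

```javascript
// The (made-up) address of the current document.
var current = "http://example.com/book/chapter5.html";

// No leading slash: the directory of the current document is kept.
var sameDirectory = new URL("example_getinfo.html", current).href;
// "http://example.com/book/example_getinfo.html"

// Leading slash: only the server name is kept.
var fromRoot = new URL("/example_getinfo.html", current).href;
// "http://example.com/example_getinfo.html"
```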

We will be adding a validity check to the form, so that it only submits if the name field is not left empty and the e-mail field contains something that looks like a valid e-mail address. Because we no longer want the form to submit immediately when the “Send!” button is pressed, its type property has been changed from "submit" to "button", which turns it into a regular button with no effect. Browser Events will show a much better way of doing this, but for now, we use the naive method.


To be able to work with the newly opened window (if you closed it, re-open it first), we ‘attach’ the console to it, like this:

attach(form);

After doing this, the code run from the console will be run in the given window. To verify that we are indeed working with the correct window, we can look at the document’s location and title properties.

alert(document.location.href);
alert(document.title);

Because we have entered a new environment, previously defined variables, such as form, are no longer present.

alert(form);

To get back to our starting environment, we can use the detach function (without arguments). But first, we have to add that validation system to the form.


Every HTML tag shown in a document has a JavaScript object associated with it. These objects can be used to inspect and manipulate almost every aspect of the document. In this chapter, we will work with the objects for forms and form fields; The Document-Object Model talks about these objects in more detail.

The document object has a property named forms, which contains links to all the forms in the document, by name. Our form has a property name="userinfo", so it can be found under the property userinfo.

var userForm = document.forms.userinfo;
alert(userForm.method);
alert(userForm.action);

In this case, the properties method and action that were given to the HTML form tag are also present as properties of the JavaScript object. This is often the case, but not always: Some HTML properties are spelled differently in JavaScript, others are not present at all. The Document-Object Model will show a way to get at all properties.

The object for the form tag has a property elements, which refers to an object containing the fields of the form, by name.

var nameField = userForm.elements.name;
nameField.value = "Eugène";

Text-input objects have a value property, which can be used to read and change their content. If you look at the form window after running the above code, you’ll see that the name has been filled in.


All we have to do now is determine what happens when people click the ‘Send!’ button. At the moment, it does not do anything at all. This can be remedied by setting its onclick property.

userForm.elements.send.onclick = function() {
    alert("Click.");
};

Just like the actions given to setInterval and setTimeout (Object-oriented Programming), the value stored in an onclick (or similar) property can be either a function or a string of JavaScript code. In this case, we give it a function that opens an alert window. Try clicking it.


Another trick related to form inputs, as well as other things that can be ‘selected’, such as buttons and links, is the focus method. When you know for sure that a user will want to start typing in a certain text field as soon as he enters the page, you can have your script start by placing the cursor in it, so he won’t have to click it or select it in some other way.

userForm.elements.name.focus();

Because the form sits in another window, it may not be obvious that something was selected, depending on the browser you are using. Some pages also automatically make the cursor jump to the next field when it looks like you finished filling in one field ― for example, when you type a zip code. This should not be overdone ― it makes the page behave in a way the user does not expect. If he is used to pressing tab to move the cursor manually, or mistyped the last character and wants to remove it, such magic cursor-jumping is very annoying.
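
With these pieces, the validity check itself can be put together. What follows is a minimal sketch rather than a definitive implementation: the function name validInfo and the regular expression are assumptions made for this example, and the test for e-mail addresses is deliberately loose (something, an @, something, a dot, something). It assumes userForm still refers to the form, as defined earlier.

```javascript
// Decide whether a name and an e-mail address look acceptable.
function validInfo(name, email) {
  return name != "" && /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email);
}

// Replace the click handler of the "Send!" button with one that checks
// the fields first. The typeof guard only serves to let this sketch
// load in an environment where no userForm variable exists.
if (typeof userForm != "undefined") {
  userForm.elements.send.onclick = function() {
    var fields = userForm.elements;
    if (validInfo(fields.name.value, fields.email.value))
      userForm.submit();  // send the form, as a submit button would
    else
      alert("Give us a name and a valid e-mail address!");
  };
}
```

The submit method of a form object sends the form just as pressing a submit button would, so the button only has an effect when both fields pass the check.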


detach();

Test the validator. When you enter valid information and click the button, the form should submit. If the console was still attached to it, this will cause it to detach itself, because the page reloads and the JavaScript environment is replaced by a new one.

If you haven’t closed the form window yet, this will close it.

form.close();

5.13. Glossary

client
A program that accesses the services offered by a server.
network
A collection of computers and hardware components interconnected by communications channels enabling the sharing of information and resources.
protocol
A system of digital message formats and rules for exchanging messages in and between computing systems.
server
A program that offers services to client programs.

5.14. Exercises