UNIX/CGI/Perl
It's very common for web sites to generate web pages on the fly, using information from a server database to create and display HTML pages on demand (dynamic HTML). The pages are created by a program that runs on the server. Since VB does not run on UNIX, and since most servers run UNIX, VB programmers are out of luck. Fortunately, UNIX does support the very popular freeware language "Perl", which I call the "QBasic of UNIX". Perl is an interpreted language, like QBasic, with which you can write programs that run on UNIX servers and that you can call from your web pages. Perl is functionally similar to QBasic and the basics of Perl are fairly easy to learn. This part of the tutorial briefly covers UNIX, CGI and Perl.


UNIX Overview

Fact is, most web servers use the UNIX operating system. If your ISP operates from a UNIX machine, then you cannot use Microsoft Windows/NT products on the server. Either you find a new ISP who offers an NT server, or it's time to pick up a few new programming skills.

One quick comment - Perl has been ported to the PC. I develop my scripts in Win98, then transfer them to my UNIX server. You do not have to have UNIX locally to develop your scripts. I also test out the scripts on the UNIX environment using the Telnet utility that comes with Win98 (remote access to the UNIX server from my PC).

When UNIX is running on a computer it presents the user with a command line (just like DOS) from which commands may be typed in for immediate execution. Just like DOS, you can type in single commands, create batch files, or start programs. If you prefer a graphical interface, there is also a program called XWindows which gives a UNIX machine an interface similar to the one Microsoft Windows/NT gives to PCs.

While many of you might have a server (UNIX machine) of your own, most of you reading this tutorial will be working with a remote server - one that your ISP maintains for you. Fortunately, you can access the remote server (where your web site is located) using a variety of software applications. A very common application called Telnet comes free with Win9x/NT. Telnet works from within a DOS window. Just type in "telnet" at the DOS prompt.

When accessing a server remotely, you use Telnet to create a connection to the server. Once connected, Telnet presents you with a window on your home PC which lets you interact with the server just as though you were sitting right in front of it. You will notice a delay between your entries and the response of the UNIX server due to the phone/modem connection, but otherwise everything is done in real time.

The UNIX operating system is designed to allow multiple people to log on at the same time, each operating in their own virtual computer space - unable to see or affect the other users that are also logged on to the same machine.

Just like with PC programs that you're familiar with, the output of a program running on the UNIX server can be sent to the screen or to files (or to printers if your ISP allows it, but most don't for security reasons). Input is made via the mouse or keyboard (Telnet supports both). In UNIX terminology the screen is referred to as STDOUT and the keyboard is referred to as STDIN (just like in DOS, although the terminology is not used as much in DOS).

In the next section, we'll talk about basic UNIX commands which you can run from the Telnet prompt.



UNIX Commands

Despite its differences, a UNIX machine accepts commands that are essentially the same as those that a PC DOS machine will accept. Mostly, the commands have to do with file and directory manipulations, as well as with basic text manipulations.

However, whereas DOS uses only COMMAND.COM to supply the familiar C: prompt, UNIX has several common equivalents to COMMAND.COM. Three of these "shells", as they are called, are used most often: Bourne, C, and Korn. The shells are generally similar to one another, although there are interface and command differences.

Which one you have on your server depends on what your ISP chose. Generally, the choice of a shell is not critical because you will be writing your web server programs in Perl or C rather than the batch language that is specific to the particular shell that you use to interface to the UNIX operating system. I'll have more to say on this in the CGI section.

Below are the most common UNIX commands that you might use during a Telnet session. Remember that these are UNIX commands - not Telnet commands. Telnet is just a program that connects you to the UNIX server for remote access. You'll see that these commands are mostly associated with manipulating files on the server.

While you can use UNIX commands to edit/delete/move files directly on the server, it is usually more convenient to manipulate the files on your PC and then copy them over to the UNIX server. Passwords are almost always required to establish a Telnet connection to protect the server contents from malicious acts from unauthorized visitors.

One thing to note - UNIX is case-sensitive! If you're following examples on how to use UNIX be sure to follow the case of the examples.

Common UNIX commands:

  • clear: clear screen
  • cd: change directory
  • ls: list directory content
  • pwd: print working directory (the current directory)
  • mkdir: make directory
  • rmdir: remove directory
  • cp: copy a file
  • mv: move a file
  • rm: remove a file
  • man: manual (lookup documentation on a command)
  • whatis: short description of a command
  • cat: concatenation (display the contents of a file on the screen)
  • more: more (view a file a page at a time)
  • vi: vi text editor
  • pico: pico text editor

UNIX also supports redirection (the <, > and >> operators) and uses the . and .. notation to describe the current and parent directories.

Finally, there is the issue of file permissions. On a file by file basis, UNIX controls reading, writing, and executing separately for three classes of user: the file's owner, the owner's group, and everyone else. In general, web servers are set up such that all executable files must be placed in a directory called CGI-BIN. While not a requirement of UNIX, it is a practice followed to prevent uncontrolled use of programs.

From a programmer's viewpoint the key is that programs (but not data files) must all be located in a particular directory, and the execution of those programs can be controlled/limited in a variety of ways. This is critical in that web servers would otherwise be susceptible to hackers who would plant programs of their own on the server - programs which could do uncontrolled damage to the files on the server. During your Telnet sessions you can set the security for your executable files. Typically, a "chmod 755 filename" command is used to set a file's security. You can get more information on the chmod command by using the UNIX "man" command.
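To make the 755 in that command less mysterious, here is a minimal Perl sketch that decodes an octal mode into the rwx notation that "ls -l" displays. The helper name mode_to_string is made up for this illustration; it is not a built-in Perl function.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a mode such as 0755 into "rwxr-xr-x".
# Each octal digit covers one class of user: owner, group, everyone else.
sub mode_to_string {
    my ($mode) = @_;
    my $text = '';
    foreach my $shift (6, 3, 0) {
        my $bits = ($mode >> $shift) & 7;
        $text .= ($bits & 4) ? 'r' : '-';   # 4 = read
        $text .= ($bits & 2) ? 'w' : '-';   # 2 = write
        $text .= ($bits & 1) ? 'x' : '-';   # 1 = execute
    }
    return $text;
}

print mode_to_string(0755), "\n";   # rwxr-xr-x
```

So 755 lets the owner read, write and execute the file, while the group and everyone else may only read and execute it.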



CGI Scripts

This brings us to the concept of CGI - Common Gateway Interface. When a web browser sends a request to a web server the request may include a simple request to return a web page. Often, satisfying the request may require that the server execute a program (contained in the CGI-BIN directory) which creates the web page that is to be returned.

You may have heard of the phrase "CGI scripts" but you need to be aware that CGI is a specification for how the server and the executable program exchange data. CGI is not a computer language and there is no such thing as a CGI script - just programs which comply with the CGI specification. Even so, the phrase CGI scripts is commonly used and simply means any computer program which can be used to return data to a web server.

Here's the important part - any language (C, VB, Java, Perl, QBasic, VBScript, or even batch commands of the OS shell) can be used as a CGI script, provided that the language will run on the server.

For us VB programmers the downside is that VB, QBasic and VBScript have not been ported to UNIX. They work only on Microsoft servers. And, since most servers are UNIX-based, it's pretty likely that most of you will not be able to use your VB/QBasic/VBScript knowledge to create interactive web sites.

That's the case for me. My ISP uses a UNIX server running the Apache web server. I could probably find an ISP that supports an NT server but that means I'd have to change ISPs - something I don't have the time nor inclination to do.

So, my choices are basically down to two languages - C or Perl, both of which are available for use on UNIX servers. I've chosen Perl for reasons that I list below.



Perl Overview

As a long-time VB programmer I never knew much about Perl. Now I've found that it is the de facto standard for writing CGI scripts. It has taken over that role for a few basic reasons:

  • It is interpreted, just like QBasic, thus allowing for rapid evaluation of scripts (programs) without going through the compile stage.
  • It is free, just like QBasic. Unlike QBasic, Perl is in a continuing phase of development.
  • It has very powerful text handling capabilities. Not that Perl can do anything I can't write in VB or QBasic, but the built-in Perl commands can do things in 1 or 2 lines that VB might take tens of lines to complete! The downside is that Perl can be very cryptic looking. It was not written with English in mind!

In the few weeks since I decided to incorporate dynamic page generation on my web site (i.e., creating web pages on demand from a database on my web site server) I've been able to understand Perl well enough to write the Perl programs that do what I needed.

Perl is simple enough, and close enough to QBasic, that learning the basics of Perl was pretty simple. There are many web site tutorials which provided all the instruction I needed to write simple Perl programs. However, don't be misled. Perl is too complex a language to be learned fully in a few short weeks. Becoming an expert with Perl will take much longer but I've found that even as a fairly new beginner you can write productive CGI scripts!

Without further ado, let's look at a very simple Perl program:

   #!/usr/bin/perl
   $name="Hello World";
   print $name;
Except for the header line, this program looks amazingly like a QBasic program, doesn't it?

The first line is a standard header for UNIX scripts (each language has its own header, telling UNIX where to find the executable) and simply tells UNIX where to find the Perl executable.

Line 2 gives a value to a string variable and Line 3 prints the string variable. Perl can really be this simple!

Here's a very important point about this script, one that applies to all Perl programs. The Perl print statement goes, by default, to the screen. If you ran this program in a Telnet session, "Hello World" would be displayed on your screen.

If you called this CGI (Perl) script using a web page (I'll explain how later), you would get a web page back with those same words! Here's the explanation of how it works:

  • The web browser sends a request to the web server to run the CGI script
  • The web server receives the request and tells UNIX to run the script
  • UNIX executes the script and redirects the output of the script to the web server - i.e., the STDOUT is changed from the screen to a data stream that goes to the web server
  • The web server sends the redirected output to the web browser
  • The web browser displays the results

This is pretty much how all web servers work. A link on a web page can ask the web server for an existing web page or it can ask the web server to run a CGI script which in turn will generate a web page on the fly! All of the big commercial web sites work on this basis.

I've not yet explained one important piece of information - that the output of the CGI script (the Perl program) must be in a format such that the web browser knows what to do with it. For example, consider this simple HTML page:

   <HTML>
   <Head>
   </Head>
   <Body>
   <H1>Hello World!</H1>
   </Body>
   </HTML>
The sample Perl program I showed above would actually have to print all of these lines to comply with the specification for an HTML document. In practice a browser can get by on much less, but creating the extra HTML tags is not difficult and in all of my Perl scripts I include the full set of HTML tags in my output.
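One detail the three-line program leaves out: under the CGI specification a script must print a header line naming the content type, followed by a blank line, before the page itself. Here is a minimal sketch of a complete script, assuming the output is plain HTML. The helper name build_page is only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the full CGI response for a short message.
sub build_page {
    my ($message) = @_;
    # The header and the blank line after it must come before any HTML.
    my $page = "Content-type: text/html\n\n";
    $page .= "<HTML>\n<Head>\n</Head>\n<Body>\n";
    $page .= "<H1>$message</H1>\n";
    $page .= "</Body>\n</HTML>\n";
    return $page;
}

print build_page("Hello World!");
```

Run from a Telnet session, this prints the header and the HTML to the screen. Run by the web server, the same text travels to the browser, which uses the header to decide how to display what follows.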

In case you haven't realized it by now, creating dynamic web pages means you not only have to learn a new language such as Perl but you have to understand HTML programming as well. This shouldn't be a problem for most web masters because learning HTML was a pre-requisite for creating our sites. The only new thing to learn is how to use Perl to create the web page content.

I do not cover HTML programming in this tutorial but you will have to master it before you can be proficient at creating dynamic web pages. There are many online sources for learning the skills. I suggest the Web Developer's Virtual Library. It has a wide variety of tutorials covering various aspects of web site programming.



Perl Language

And finally we come to the fun part - coding! The intent with this section is to provide you with a quick overview of the basic commands which Perl provides, including some short snippets of code to show you how they work. With study you should be able to use this tutorial to write some programs of your own, but you will want to supplement this section by reading some of the other online tutorials about Perl. At the end of this section I provide a sample Perl program which shows how the following commands can be tied together to create a useful program.

Syntax comments

  • Perl is case sensitive. $Foo and $FOO are different variable names.
  • Variables start with $. $string = "Gary Beene"
  • Arrays start with @. @string = ("one", "two", 3, 4, "dog")
  • Arrays can contain mixed types of data
  • Associative arrays are supported. These are pairs of values, where a value is retrieved by using the first value of the pair as the index.
  • Associative arrays start with %. %alldata = ("one", 1, "two", 2)
  • Perl statements end with ;. print "Hello";
  • Subroutine names can be prefixed with & when they are called.
  • Notice the different brackets used to reach a single element: $string[0] for arrays and $alldata{"one"} for associative arrays
  • Comment lines begin with #.
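The syntax points above can be seen together in a short sketch. The variable and subroutine names are only examples.

```perl
use strict;
use warnings;

my $string  = "Gary Beene";                  # scalar
my @list    = ("one", "two", 3, 4, "dog");   # array, mixed data
my %alldata = ("one", 1, "two", 2);          # associative array

# Single elements are scalars, so they are written with $.
# Arrays use square brackets, associative arrays use braces.
my $third = $list[2];          # 3
my $two   = $alldata{"two"};   # 2

# A subroutine, called here with the optional & prefix.
sub greet {
    my ($who) = @_;
    return "Hello, $who";
}
my $greeting = &greet($string);

print "$third $two $greeting\n";   # 3 2 Hello, Gary Beene
```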

Operations

  • ++$a will increment $a by 1
  • --$a will decrement $a by 1
  • $a = 5 ** 10 gives 5 raised to the power of 10
  • $a = $b . $c concatenates $b and $c
  • $a = $b x 5 makes $a become $b repeated 5 times
  • $a += $b adds $b to $a
  • $a -= $b subtracts $b from $a
  • $a .= $b appends $b onto $a (see string printing below for an alternative)
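A quick sketch of these operators at work, with the expected values shown in the comments. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $count = 5;
++$count;                      # 6
--$count;                      # back to 5

my $power = 2 ** 10;           # 1024
my $first = "Visual";
my $both  = $first . "Basic";  # "VisualBasic"
my $line  = "-" x 5;           # "-----"

my $total = 10;
$total += 5;                   # 15
$total -= 3;                   # 12

my $text = "Perl";
$text .= " rocks";             # "Perl rocks"

print "$count $power $both $line $total $text\n";
```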

Arrays

  • @data = (1, 2, "three") puts 3 values into the array @data
  • $data[2] refers to the third (0,1,2) position of the array
  • push (@data, "dog") adds the value "dog" to the end of the array
  • pop (@data) removes the last value of the array
  • ($a, $b) = ($c, $d) is same as $a=$c and $b=$d
  • ($a, $b) = @data assigns first two values of array @data to $a and $b
  • $#data gives the largest index value of the array @data
  • $a = @data assigns the length of the array to $a
  • $a = "@data" assigns the entire list of elements of @data to $a
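The array operations above, tried out on one small list (a sketch; the names are arbitrary):

```perl
use strict;
use warnings;

my @data = (1, 2, "three");
push(@data, "dog");              # (1, 2, "three", "dog")
my $last = pop(@data);           # "dog"; @data is back to 3 values

my ($x, $y) = @data;             # $x = 1, $y = 2
my $top     = $#data;            # 2, the largest index
my $length  = @data;             # 3, the number of elements
my $joined  = "@data";           # "1 2 three"

print "$last $x $y $top $length $joined\n";
```

Note that putting the array inside double quotes joins its elements with single spaces.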

File Handling

  • open (INFO,"filename") to open a file for input using the handle INFO
  • @lines = <INFO> to read the entire file into the array @lines
  • close(INFO) to close the file
  • open (INFO, ">filename") for output
  • open (INFO, ">>filename") to append
  • open (INFO, "<filename") for input (the "<" is optional)
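Here is a small sketch that writes, appends to, and then reads back a file using the open modes listed above. It assumes the standard File::Temp module is available and uses it only to get a scratch file name, so no real data is touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file, so the sketch does not touch any real data.
my ($tmp, $file) = tempfile(UNLINK => 1);
close($tmp);

open(INFO, ">$file") or die "Cannot write $file: $!";    # output
print INFO "first line\n";
close(INFO);

open(INFO, ">>$file") or die "Cannot append $file: $!";  # append
print INFO "second line\n";
close(INFO);

open(INFO, "<$file") or die "Cannot read $file: $!";     # input
my @lines = <INFO>;
close(INFO);

print "Read ", scalar(@lines), " lines\n";   # Read 2 lines
```

Checking the result of open with "or die" costs one phrase and saves a lot of guessing when a file name or permission is wrong.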

String printing

  • print '$Hello' (single quotes) prints the six characters: $Hello
  • print "$Hello" (double quotes) prints the value of a variable called $Hello
  • print "Hi there, $myname, how are you?" inserts the value of the variable $myname into the string
  • print @lines to print the entire array (one long string unless the string contains CRLF characters)
  • print INFO "Hello" prints the word "Hello" to the file opened as INFO
  • "\n" is a newline (carriage return / line feed)
  • "\t" is a TAB
  • print <<XXX; on one line prints everything that follows as-is, up to a line containing only XXX (embedded variables are recognized)
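The difference between the quoting styles is easiest to see side by side. In this sketch the strings are stored in variables before being printed so that each result can be inspected; the names are arbitrary.

```perl
use strict;
use warnings;

my $myname = "Gary";

my $single = 'Hi there, $myname';    # single quotes: no substitution
my $double = "Hi there, $myname";    # double quotes: variable is inserted

# A "here document": everything up to the line holding only XXX.
my $block = <<XXX;
Name:\t$myname
Done
XXX

print "$single\n$double\n$block";
```

The closing XXX must sit alone at the very start of its line, with nothing after it.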

Boolean

  • scalar is TRUE if not a null string
  • scalar is TRUE if not zero
  • $a==$b tests if $a is numerically equal to $b
  • $a!=$b tests if $a is not numerically equal to $b
  • $a eq $b tests if $a is string-equal to $b
  • $a ne $b tests if $a is not string-equal to $b
  • ($a && $b) tests if $a AND $b are true
  • ($a || $b) tests if $a OR $b is true
  • !($a) tests if $a is false
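Mixing up the numeric and string comparisons is an easy mistake for a VB programmer, so here is a short sketch of the difference, plus a check of which values count as true. The variable names are arbitrary.

```perl
use strict;
use warnings;

# == compares as numbers, eq compares as strings.
my $same_number = ("1.0" == 1)   ? "yes" : "no";   # yes
my $same_string = ("1.0" eq "1") ? "yes" : "no";   # no

# Empty strings and zero count as false; most other values are true.
my @truth = map { $_ ? "true" : "false" } ("", 0, "0", "dog", 5);

print "$same_number $same_string @truth\n";
# yes no false false false true true
```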

Control Structures

  • foreach - walks through an array
           @food = ("apple", "pear", "peach");
           foreach $morsel (@food)
           {
              print $morsel;
           }
    
  • for - executes a block of statements while an expression is True
           for ($i=1; $i<10; ++$i)
           {
              print "$i\n";
           }
    
  • while or until - executes a block of statements while an expression is true (while) or until it becomes true (until)
           while ($a ne "stop")
           {
              chomp($a = <STDIN>);
           }
    
           --- or ---
    
           while ($line = <INFO>)
           {
              chomp($a = <STDIN>);
           }
    
           --- or ---
    
           do 
           {
              chomp($a = <STDIN>);
           }
           while ($a ne "stop");
    
    
  • if - executes a block once if an expression is true
           if ($a)
           {
              print "True";
           }
    
           --- or ---
    
           if ($a)
           {
              print '$a is True';
           }
           elsif ($b)
           {
              print '$b is True';
           }
           else 
           {
              print 'none were True';
           }
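
The structures above can be combined in one short loop. This sketch uses a fixed list as a stand-in for lines typed at the keyboard, so it runs without any input; the list contents and variable names are made up for the example.

```perl
use strict;
use warnings;

# Stand-in for lines typed at the keyboard; each ends in a newline,
# just as a line read from <STDIN> would.
my @typed = ("apple\n", "pear\n", "stop\n", "peach\n");

my @kept;
foreach my $entry (@typed) {
    chomp($entry);               # remove the trailing newline first
    last if $entry eq "stop";    # leave the loop at the stop word
    if (length($entry) > 4) {
        push(@kept, uc($entry)); # long words are stored in upper case
    }
    else {
        push(@kept, $entry);
    }
}

print "@kept\n";   # APPLE pear
```

Without the chomp, "stop\n" would never equal "stop" and the loop would run through the whole list.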
    

Matching / Substitution

  • /xxx/ is a string between slashes and is called a Regular Expression
  • $string =~ /the/ is True if "the" is in the variable $string
  • $string !~ /the/ is True if "the" is NOT in the variable $string
  • Special characters between the slashes affect how the matching is tested
  • $string =~ /^x/ tests for x at the start of the string
  • $string =~ /x$/ tests for x at the end of the string
  • $string =~ /./ tests for any single character
  • $string =~ /t.e/ tests for t and e separated by any one character
  • $string =~ /^$/ tests for a string with nothing in it
  • $string =~ /[a-z]/ tests for any one lower case letter
  • $string =~ /[a-zA-Z]/ tests for any one letter of either case
  • $string =~ s/dog/cat/ replaces dog with cat first time it appears in the string
  • $string =~ s/dog/cat/gi replaces dog with cat anywhere in the string, case insensitive
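A short sketch of matching and substitution on one sample sentence. The sentence and variable names are invented for the example.

```perl
use strict;
use warnings;

my $string = "The dog chased the other Dog";

my $has_the    = ($string =~ /the/)  ? 1 : 0;   # 1: "the" appears
my $starts_low = ($string =~ /^t/)   ? 1 : 0;   # 0: starts with capital T
my $ends_dog   = ($string =~ /Dog$/) ? 1 : 0;   # 1: "Dog" is at the end

(my $once = $string) =~ s/dog/cat/;     # first match only
(my $all  = $string) =~ s/dog/cat/gi;   # every match, any case

print "$once\n$all\n";
```

The first substitution leaves the capitalized "Dog" alone, while the second, with the g and i options, replaces both.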

Sample Perl Program

This section on Perl contained some pretty terse information, but an example will help show how the commands can be fit together to create useful scripts. Here's some actual code that I use at my web site. It receives a search string from a web browser and compares each record in a database to see if it should be printed out.

In this example I've left off the top and bottom parts of my script, which are where I've put the HTML print statements. Those parts were very long and I just wanted to show the heart of the Perl routine. In this example, I have received a string called $value which represents a search string to find in my database, a text file called "sites.txt".

        open(INFO, "sites.txt") or die "Cannot open sites.txt: $!";
        @lines = <INFO>;
        foreach $record (@lines) {
          # Fields in each record are separated by ;;;
          @elements = split(/;;;/, $record);
          if ($value eq 'all') {
            # Show every record
            print "<tr><td><a href=\"$elements[0]\">",
                  "<b>$elements[1]</b></a>",
                  "<td align=center>$elements[6]<td>...$elements[8]";
          }
          else {
            # Show only records whose field 5 matches the search string
            if ($elements[5] eq $value) {
              print "<tr><td><a href=\"$elements[0]\">",
                    "<b>$elements[1]</b></a>",
                    "<td align=center>$elements[6]<td>...$elements[8]";
            }
          }
        }
        close (INFO);

In this example, I open the file and read it all into an array called @lines, with one record per element in the array. Then I split each record into an array called @elements (I used ;;; to separate the fields in my database). I then check whether the search string is 'all' or whether it matches position 5 of the array. In either case, if the test is True the information is printed out.

Even from this short script you can see the HTML tags that I use to format the printed values so that a web browser will interpret the output of my script as an HTML file.


Last, but not Least

I sort of snuck in the comment that my CGI script received a variable $value which held the search string supplied by the web server. This needs a bit more explanation. When a web browser sends a request to the web server to execute a CGI script, the request contains quite a bit of information. Of interest to a CGI script writer is that the additional information sent by the web browser is passed to the CGI script. Some of the data is easily accessible through UNIX environment variables, which are set by the web server. However, some of the information sent by the web browser is included in a simple encoded (specially formatted) string which the CGI script must decode.

There are many sites on the web which provide free Perl scripts, including the code to extract the information from the encoded string. I didn't show it in this tutorial, but you will have to locate that code and include it in your Perl scripts. It's a little more complicated than I've indicated but is well within the capabilities of a beginning Perl programmer.
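
To give a feel for what that decoding involves, here is a minimal sketch for the simplest case, a GET request, where the web server places the encoded string in the QUERY_STRING environment variable. The helper name parse_query and the fallback string 'value=all' are assumptions made for this illustration. This is not the code from my site, and it does not handle forms sent with the POST method.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a string such as "value=all&name=Gary+Beene".
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (split(/&/, $query)) {
        next unless length $pair;                  # skip empty pieces
        my ($key, $val) = split(/=/, $pair, 2);
        $val = '' unless defined $val;
        foreach ($key, $val) {
            tr/+/ /;                               # + stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX is a hex character code
        }
        $form{$key} = $val;
    }
    return %form;
}

# For a GET request the web server places the string in QUERY_STRING.
# The fallback 'value=all' is only here so the sketch runs from a prompt.
my %form  = parse_query($ENV{QUERY_STRING} || 'value=all');
my $value = $form{value};
```

With something like this at the top of a script, $value holds the search string that the sample program above compares against each record.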