Perl Information Center Tutorials - File Handling
These tutorials were written to help you get a quick, but thorough, understanding of Perl -
the scope of the language as well as its specific capabilities.
Files
Perl offers more functions that deal with files, I/O, and directory
management than any other capability it supports. This wide variety of
file functions is supplemented by some of the most terse (does more in
fewer words - that's good!) syntax of any language, making
file I/O operations extremely easy.
- close, flock, open, select
- eof, getc, read, seek, tell
- chmod, unlink, rename, stat, truncate
- print, printf, write, format
Opening/Closing Files
In Perl a file has to be opened and given a filehandle before the
content can be accessed or before information can be written to the file.
There are three modes in which a file can be opened - read-only,
overwrite, or append, as shown in this sample code:
open(INFILE, "<input.txt"); close INFILE; # read only
open(OUTFILE, ">output.txt"); close OUTFILE; # overwrite
open(LOGFILE, ">>append.txt"); close LOGFILE; # append
Note that for read only, the < in the quotes is optional, so that "input.txt"
is acceptable.
Perl will close all files when the program ends, but a file can also be closed
explicitly as follows:
close INFILE;
Closing a file explicitly is sometimes required to reset some of Perl's special
variables, such as $. (line counter).
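The three modes can also be written with the three-argument form of open, a lexical filehandle, and an error check. This is a minimal sketch rather than the style used elsewhere in this tutorial; the filename example_open.txt is made up for the demonstration.

```perl
use strict;
use warnings;

my $file = "example_open.txt";   # hypothetical filename for this demo

# Overwrite mode: creates the file, or empties it if it already exists.
open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out "first line\n";
close($out) or die "Cannot close $file: $!";

# Append mode: new data is added to the end of the file.
open(my $log, '>>', $file) or die "Cannot append to $file: $!";
print $log "second line\n";
close($log) or die "Cannot close $file: $!";

# Read-only mode.
open(my $in, '<', $file) or die "Cannot read $file: $!";
my @lines = <$in>;               # list context: all lines at once
close($in);

print scalar(@lines), " lines read\n";   # prints "2 lines read"
unlink $file;                             # remove the demo file
```

Checking the return value of open matters because a failed open does not stop the program by itself; later reads simply return nothing.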
Reading From Files
In Perl there are three basic ways to read a file, depending on how much data
you want to read at a time.
- line - <> operator
Once a file is opened, the <> operator is used with a filehandle to read
one or more lines of text from a file, as in these examples:
$line = <INFILE>; # scalar context - reads 1 line
@lines = <INFILE>; # list context - reads all lines, \n included
@lines[0..5] = <INFILE>; # read lines into specific array elements
$/=undef;$var = <INFILE>; # read all lines as a single string
$var = do { local $/; <INFILE> }; # safer way to read all lines as string
The while loop is often used to walk through a file, one line at a time, with action
being taken on each line.
open(INFILE, "<input.txt");
while (<INFILE>) { # each line assigned to $_
print "Line $. : $_"; # $. special variable counts lines
}
close INFILE;
- character - getc function
With the getc function, a single character can be acquired. This is typically used
to capture keyboard input, rather than getting data from a file, but it works for either.
$var = getc(STDIN); # from STDIN (typically the keyboard)
$var = getc(IN); # from a file
- fixed length - read function
Some files, particularly databases, are divided into records which contain a fixed number
of characters per record (newlines are not used, especially not to act as a record separator).
To capture a fixed number of characters, Perl provides the read function.
$count = read(IN, $buffer, $number_characters); # data goes into $buffer; returns number of characters read
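To make the fixed-length case concrete, here is a small sketch that writes three 8-character records with no newlines and reads them back with read. The filename records.dat and the record length of 8 are assumptions chosen for the demo.

```perl
use strict;
use warnings;

my $file       = "records.dat";   # hypothetical filename
my $record_len = 8;               # assumed record size in characters

# Build a demo file holding three 8-character records, no newlines.
open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out "AAAAAAAABBBBBBBBCCCCCCCC";
close($out);

open(my $in, '<', $file) or die "Cannot read $file: $!";
my @records;
my $buffer;
# read places the data in $buffer and returns the number of characters
# read, which is 0 at end of file, so the loop stops there.
while (my $count = read($in, $buffer, $record_len)) {
    push @records, $buffer;
}
close($in);
unlink $file;                     # remove the demo file

print "$_\n" for @records;        # one record per line
```

Note that the second argument to read is the variable that receives the data, not a position in the file.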
Printing to a File
There are three functions which provide the ability to write to files, each providing
different formatting capabilities.
- print
The print function allows writing of data to a file. Strings are written as defined by
the program, whereas numbers use Perl default formatting. The print function is used
with a filehandle followed by a list of values (no comma between the two).
print OUTFILE $text; # write one line of text to file
print OUTFILE @array; # write array into file
print OUTFILE $var1, $var2; # write 2 variables to the file
- printf
The printf function works like the print function, except that its first argument is a
format string which controls how each remaining member of the list is printed.
- write
The write function allows the user to define multi-line formats, suitable for printing
reports, particularly those which contain multiple lines using a common format. A separate
tutorial on the write function and on creating
formats is available, but here's a short example which defines a one-line report format
which prints three variables. Then the Perl program assigns values to each
variable and uses the write function to print the report to the file 'myreport.txt'.
format OUTFILE =
Test: @<<<<<<<< @||||| @>>>>>
$x, $y, $z
.
$x = "dog"; $y = "cat"; $z = "pig";
open(OUTFILE, ">myreport.txt");
write OUTFILE;
OUTPUT:
Test: dog cat pig
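Since printf is described above without an example, here is a short sketch of printf writing to a filehandle. The filename prices.txt and the sample data are made up for the demonstration.

```perl
use strict;
use warnings;

my $file  = "prices.txt";                    # hypothetical filename
my %price = (apple => 0.5, melon => 3.25);   # made-up sample data

open(my $out, '>', $file) or die "Cannot write $file: $!";
for my $item (sort keys %price) {
    # %-10s : string, left-justified in 10 columns
    # %6.2f : number, right-justified in 6 columns with 2 decimal places
    printf $out "%-10s %6.2f\n", $item, $price{$item};
}
close($out);

# Read the file back to show what was written.
open(my $in, '<', $file) or die "Cannot read $file: $!";
my @lines = <$in>;
close($in);
unlink $file;                                # remove the demo file

print @lines;
```

As with print, there is no comma between the filehandle and the format string.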
File Position
Perl keeps track of the current position in a file, which is where the next read or write
will take place. The tell function returns that position and the seek function can be
used to reset the position to anywhere within the file.
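$position = tell IN; # current position, in bytes from the start of the file
$success = seek (IN, $position, $action); # returns true on success, false on failure
The $action tells Perl how to use $position (0 for an absolute position, 1 for a position
relative to the current position, and 2 for a position relative to EOF).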
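The following sketch shows tell and seek working together on a ten-character file. The filename seek_demo.txt is an assumption, and the '+>' mode (read and write, emptying the file first) is used only to keep the demo in one filehandle.

```perl
use strict;
use warnings;

my $file = "seek_demo.txt";        # hypothetical filename

# '+>' opens for both reading and writing, emptying the file first.
open(my $fh, '+>', $file) or die "Cannot open $file: $!";
print $fh "0123456789";

my $end = tell($fh);               # position after writing 10 characters

seek($fh, 3, 0) or die "seek failed: $!";    # 0: absolute position 3
read($fh, my $middle, 2);                    # reads "34"
my $after = tell($fh);                       # now at position 5

seek($fh, -2, 2) or die "seek failed: $!";   # 2: two characters before EOF
read($fh, my $tail, 2);                      # reads "89"

close($fh);
unlink $file;                      # remove the demo file

print "end=$end middle=$middle after=$after tail=$tail\n";
```

A seek is also needed whenever a program switches between writing and reading on the same filehandle, as this example does.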
File Deletion/Properties
Other functions are provided to help manage files. The unlink function will delete
a file, chmod will set the permission properties, and stat will return a 13-element
list (including file size and dates) of file properties.
unlink "input.txt"; # deletes the file
chmod 0755, "input.txt"; # sets permissions for the file (octal mode, then filename)
@properties = stat ($filename); # returns 13-element list of file properties
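Here is a brief sketch that picks individual values out of the stat list and checks the return values of chmod and unlink. The filename props_demo.txt is made up; the index positions (7 for size, 9 for modification time) follow the order of the stat list.

```perl
use strict;
use warnings;

my $file = "props_demo.txt";       # hypothetical filename

# Create a small file to inspect.
open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out "hello\n";
close($out);

# stat returns an empty list on failure, which makes the assignment false.
my @properties = stat($file) or die "Cannot stat $file: $!";
my $size  = $properties[7];        # size in bytes
my $mtime = $properties[9];        # last modification time (epoch seconds)
print "size: $size bytes, modified: ", scalar(localtime($mtime)), "\n";

# chmod and unlink both return the number of files successfully changed.
my $changed = chmod(0644, $file);  # mode is an octal number, hence the leading 0
my $deleted = unlink($file);
print "changed $changed file(s), deleted $deleted file(s)\n";
```

The leading 0 on the mode is important: 0644 is octal, while 644 would be read as a decimal number and set different permissions.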
More on the FileHandle Operator
The <> operator is used in the examples above. Typically a filehandle (or
variable containing a filehandle) is placed in the operator, which returns
a single line when used in a scalar context and which returns all lines of
the file when used in a list context - as shown in the next two examples.
$a = <IN>; # returns a single line (scalar context)
@a = <IN>; # returns all lines (list context)
If the content of the <> operator is not a filehandle, then Perl treats the
content as a pattern to be globbed (returning all filenames which fit the pattern),
as in the following example.
while (<*.txt>) { # <> operator puts each filename matching *.txt into $_
print "$_\n"; # prints each filename on its own line
}
Here are code snippets showing various ways in which the filehandle operator
may be used.
while (<>) # returns 1 line at a time
print <> # prints all remaining lines (print supplies list context)
for ( <*.txt> ) # returns filename each loop
for (`dir *.txt /b`) # returns filename each loop (Windows command; each name ends in \n)
for (glob '*.txt') # returns filename each loop
If the <> is empty then Perl takes one of two actions. If the special
array @ARGV (contains the command line arguments) is not empty, its contents
are assumed to be filenames, whose content is read by the <> operator.
In this next example the entire content of both files is printed as a result
of using the empty <> operator.
@ARGV = ('input1.txt','input2.txt');
while (<>) { # feed one line at a time to $_
print; # print $_ by default
}
If the special array @ARGV is empty, then input is read from STDIN.
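When several files are read through the empty <> operator, the line counter $. keeps running across files unless the ARGV filehandle is closed at the end of each one. The sketch below shows that idiom; the two filenames are made up and the files are created by the script itself.

```perl
use strict;
use warnings;

# Create two small demo files (hypothetical names).
my @names = ('argv_demo1.txt', 'argv_demo2.txt');
for my $name (@names) {
    open(my $out, '>', $name) or die "Cannot write $name: $!";
    print $out "alpha\nbeta\n";
    close($out);
}

@ARGV = @names;                    # <> will read these files in order
my @report;
while (my $line = <>) {
    chomp $line;
    # $ARGV holds the name of the file currently being read
    push @report, "$ARGV:$.:$line";
} continue {
    close ARGV if eof;             # resets $. at the end of each file
}
unlink @names;                     # remove the demo files

print "$_\n" for @report;
```

Here eof is written without parentheses so that it tests the end of the current file rather than the end of all input.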
File Functions Reference
Here's a quick reference of the available file functions, in alphabetical order.
Most of these functions need explicit arguments; a few (print, stat, and unlink)
operate on $_ when called with none.
- chmod - changes file permissions
chmod LIST
chmod (0755, 'a.txt', 'bt.txt')
- close - close an open file
close FILEHANDLE
- eof - test for end of file
eof FILEHANDLE # without FILEHANDLE, applies to last file read
- flock - lock file
flock FILEHANDLE, OPERATION # LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
- format - declares a format for writing to a file
format Name =
FORMLIST
# comment # FORMLIST comments must have # in column 1
. # must be in column 1
- getc - get the next character from a file
getc FILEHANDLE
$result = getc $INFILE
- open - open a file
open FILEHANDLE, FILENAME
- print - output a list to a filehandle
print FILEHANDLE LIST
- printf - output a formatted list to a filehandle
printf FILEHANDLE FORMAT, LIST
- read - read specified number of characters from a file
read FILEHANDLE, SCALAR, LENGTH, OFFSET
- rename - change a filename
rename OLDNAME, NEWNAME
- seek - reposition file pointer for random access I/O
seek FILEHANDLE, POSITION, WHENCE
- select - reset default output or do I/O multiplexing
select FILEHANDLE
- stat - return file attributes/status
stat FILEHANDLE
stat EXPR
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks)
= stat($filename);
- tell - get current file position on a filehandle
tell FILEHANDLE
- truncate - shorten a file
truncate FILEHANDLE, LENGTH
- unlink - delete a list of files
unlink LIST
unlink <*.txt>
unlink ("a.txt", "b.txt", "c.txt")
- write - write a formatted record to a FILEHANDLE
write FILEHANDLE
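The flock entry above lists its operation constants but not where they come from. As a minimal sketch, they can be imported from the standard Fcntl module; the filename flock_demo.log is made up for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);              # imports LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB

my $file = "flock_demo.log";       # hypothetical filename

open(my $log, '>>', $file) or die "Cannot append to $file: $!";

# LOCK_EX requests an exclusive lock and waits until it is granted.
flock($log, LOCK_EX) or die "Cannot lock $file: $!";
print $log "entry written at ", time, "\n";

# Closing the filehandle flushes the data and releases the lock.
close($log) or die "Cannot close $file: $!";

my $size = -s $file;               # file size in bytes
unlink $file;                      # remove the demo file
print "wrote $size bytes\n";
```

Locks set with flock are advisory: they only coordinate programs that also call flock on the same file.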
If you have any suggestions or corrections, please let me know.