Web Client Programming with Perl

Automating Tasks on the Web

By Clinton Wong
1st Edition March 1997




This book is out of print, but it has been made available online through the O'Reilly Open Books Project.


Index

[ Symbols ], [ Numbers ], [ A ], [ B ], [ C ], [ D ], [ E ], [ F ], [ G ], [ H ], [ I ], [ K ], [ L ], [ M ], [ N ], [ O ], [ P ], [ Q ], [ R ], [ S ], [ T ], [ U ], [ V ], [ W ], [ X ]

Symbols
& (ampersand), 20, 37
= (equal sign), 37
% (percent sign), 37
+ (plus sign), 20

Numbers
100 range HTTP status codes, 47, 101
200 range HTTP status codes, 48, 101
300 range HTTP status codes, 48, 101
400 range HTTP status codes, 49-51, 102
500 range HTTP status codes, 51, 102

A
abs( ), 111
absolute URLs, 91, 111
accept( ), 66-68, 70
Accept header, 56, 177
Accept-Charset header, 177
Accept-Encoding header, 177
Accept-Language header, 178
Accept-Ranges header, 59, 181
add_content( ), 100, 102
Age header, 182
agent( ), 96
Allow header, 184
ampersand (&), 20, 37
application/x-www-form-urlencoded media type, 19, 35-37, 193-194
as_string( ), 98, 100, 103, 113
authorization/authentication, 44, 85
      Authorization header, 62-63, 178
      digest authentication, 46, 185
      LWP functions for, 96
      Proxy-Authenticate header, 182
      Proxy-Authorization header, 181
      WWW-Authenticate header, 62, 184

B
base( ), 102, 111
BASIC authorization, 62, 178
bind( ), 66-68
body, response (see entity-body)
BottomMargin attribute, 109
browsers, vii, 3
bugs, 6
byte ranges, 45, 59

C
Cache-Control header, 57, 172
caching, 57-59, 172
CGI programs, 17-20
      HTTP codes for errors in, 51, 102
character encoding (see encoding)
character sets, 201
      Accept-Language header, 178
      Content-Language header, 185
CheckSite package (example), 131-141
checksums, 46
classes, LWP (see modules, LWP)
client requests, 10, 23-24
      cache directives, 172
      HTTP codes for errors in, 49-51
      HTTP module for (LWP), 98-101
      request header, 11, 24, 53, 177-181
      request methods, 19, 24, 31-41
      robots for (examples), 125-131
      timeouts, 96
      UserAgent module (LWP), 95-97
clients (see web clients)
close( ), 66, 69
code (see source code)
code( ), 101
connect( ), 66-68
Connection header, 46, 55, 173
content( ), 100, 102
Content-Base header, 184
Content-Encoding header, 184
Content-Language header, 185
Content-Length header, 19, 59, 185
Content-Location header, 185
Content-MD5 header, 185
Content-Range header, 60, 185
Content-Transfer-Encoding header, 186
Content-Type header, 19, 40, 56, 186
Cookie header, 63, 179
cookies, 63, 179, 182
CPAN archives, 87
crack( ), 111
credentials( ), 95
current_age( ), 103

D
date (see time and date)
Date header, 174
Date module (LWP), 105
daytime server, 72
default_port( ), 112
delay( ), 98
delete( ), 108
DELETE method, 40
dictionary client (example), 145-154
digest authentication, 46, 185
directives, caching, 172
document path, 11
documentation, HTTP, 46
documents (see files/documents)

E
Element module (LWP), 90, 107
encoding, 193-194
      Accept-Encoding header, 177
      Content-Encoding header, 184
      Content-Transfer-Encoding header, 186
      Content-Type header, 19, 40, 56, 186, 193
      Transfer-Encoding header, 175
encoding URLs (see URL-encoded format)
entity-body, 12
      storing at URL, 38-40
entity headers, 24, 54, 184-187
entity tags, 44, 58
env_proxy( ), 97
eparams( ), 112
epath( ), 112
eq( ), 113
equal sign (=), 37
equery( ), 112
error_as_HTML( ), 102
errors, HTTP status codes for, 49-51, 102
Escape module (LWP), 110
ETag header, 59, 186
expanding relative URLs, 91
Expires header, 58, 186
extract_links( ), 90, 108
extracting links from files, 80-84, 90, 121-124

F
fedex program (example), 125-131
filehandles (see sockets)
files/documents
      caching, 57-59, 172
      extracting links from, 80-84, 90, 121-124
      publishing on web servers, 20-23
      referring, 60, 181
      retrieving, 31, 88
            with telnet, 16
            (see also GET method)
      storing at URLs, 38-40
      uploading with POST method, 37
FontFamily attribute, 109
FontScale attribute, 109
format( ), 108
FormatPS module (LWP), 108
FormatText module (LWP), 108
forms, HTML, 17-20, 36
frag( ), 113
fresh_until( ), 103
freshness_lifetime( ), 103
from( ), 96
From header, 179
FTP, obtaining examples by, x
full_path( ), 113

G
gateway systems (see proxy servers)
general headers, 24, 52, 171-176
get( ), 88, 94
get_basic_credentials( ), 96
GET method, 29, 31
Getopts( ), 76
getprint( ), 88, 94
getstore( ), 94
graphical browsers (see browsers)
Graphical User Interface (GUI), 143
graphical user interface (see Tk extension)
graphics (see images)

H
hcat program (example), 76-80, 118-121
head( ), 94
HEAD method, 33
header( ), 100, 102, 104
headers, 52-64, 171-189
      entity headers, 24, 54, 184-187
      general headers, 24, 52, 171-176
      Headers module for (LWP), 103
      identification headers, 61
      retrieving, 33
      (see also under specific header name)
hgrepurl program (example), 81-84, 121-124
HorizontalMargin attribute, 109
host( ), 112
Host header, 11, 179
host_wait( ), 98
hostnames, 11, 179
      multihoming, 44
HTML (Hypertext Markup Language), 13-15
      converting to PostScript, 109
      documents (see files/documents)
      error explanations in, 102
      forms, 17-20, 36
      HTML module for (LWP), 89, 106-109
      parsing, 89
      tag parameters, 84
HTTP (Hypertext Transfer Protocol), 1, 4, 23-25, 28-30
      headers (see headers)
      HTTP module for (LWP), 98-106
      requests (see client requests)
      responses (see server responses)
      status codes, 30, 47-52, 101, 104
      versions of, 29, 41-47, 86, 187
hyperlinks (see URLs)

I
IANA (Internet Assigned Numbers Authority), 191
identification headers, 61
If-Match header, 59, 180
If-Modified-Since header, 57, 179
If-None-Match header, 59, 180
If-Range header, 60, 180
If-Unmodified-Since header, 58, 180
IGNORE_TEXT flag, 107
IGNORE_UNKNOWN flag, 107
images, 13
IMPLICIT_TAGS flag, 107
informational HTTP status codes, 47, 101
initializing sockets, 68
Internet Assigned Numbers Authority (IANA), 191
Internet media types (see media types)
is_client_error( ), 104
is_error( ), 95, 102, 104
is_fresh( ), 103
is_info( ), 101, 104
is_protocol_supported( ), 96
is_redirect( ), 101
is_server_error( ), 104
is_success( ), 95, 101, 104

K
keep-alive connections, 46, 55, 173

L
languages, 195-200
      Accept-Language header, 178
      Content-Language header, 185
Last-Modified header, 58, 187
Leading attribute, 109
LeftMargin attribute, 109
listen( ), 70
Location header, 85, 187
LWP library, 5, 65, 87-116
      modules of, 92-113
      periodic clients (examples), 125-131
      recursive clients (examples), 131-141
      simple clients (examples), 118-124
LWP module, 88, 93-98

M
Max-Forwards header, 41, 180
media types, 55, 186, 191-193
      MIME-Version header, 174
      (see also encoding)
message( ), 102
metainformation, 44, 52, 85
method( ), 100
methods (see request methods)
MIME types (see media types)
MIME-Version header, 174
mirror( ), 94, 96
mnemonics, Status module (LWP), 105
modification time, 12, 57, 187
modules, LWP, 92-113
Mosaic browser, 3
multihoming, 44

N
netloc( ), 111
Netscape Navigator, 3
      cookies, 63, 179, 182
no_proxy( ), 97
no_visits( ), 98

O
obtaining (see retrieving)
options, 6, 41
OPTIONS method, 41

P
pack( ), 68
package delivery programs, 125-131, 154-162
PageNo attribute, 109
PaperHeight attribute, 109
PaperSize attribute, 109
PaperWidth attribute, 109
params( ), 112
parse_html( ), 90, 107, 146
parse_htmlfile( ), 107
parsing
      HTML, 13-15, 89
      Parse module for (LWP), 107
      URLs, 10, 74
password( ), 112
path( ), 112
paths, document, 11
percent sign (%), 37
periodic clients (examples), 125-131
Perl language
      LWP library, 5, 65, 87-116
      sockets library, 65
      Tk (see Tk extension)
persistent connections, 46, 55, 173
persistent-state cookies, 63, 179, 182
pinging servers, program for, 162-169
pl2bat program, 89
plus sign (+), 20
port( ), 112
POST method, 19, 34-38
PostScript, converting HTML into, 109
Pragma header, 57, 175
print command, 69
proxy( ), 97
proxy servers, 45, 57-59, 97, 115
      caching and, 57-59, 172
      Pragma header, 57, 175
      TRACE method, 41-42
Proxy-Authenticate header, 182
Proxy-Authorization header, 181
Public header, 182
publishing on web servers, 20-23
push_header( ), 104
PUT method, 38-40

Q
query( ), 113

R
Range header, 45, 60, 181
reading from network connection, 69
recursive clients (examples), 131-141
redirection, 85
      HTTP status codes for, 48, 101
Referer header, 60, 181
rel( ), 111
relative URLs, 91, 111
remove_header( ), 104
request( ), 95
request header, 11, 24, 29, 177-181
request methods, 19, 24, 31-41
Request module (LWP), 98-101
requests (see client requests)
response header, 12, 25, 30, 54, 85, 181-184
Response module (LWP), 98, 101-103
responses (see server responses)
retrieving
      example code, x
      files/documents, 31, 88
      headers, 33
      LWP library, 87
      with telnet, 16
Retry-After header, 182
RightMargin attribute, 109
robots
      periodic clients (examples), 125-131
      Robot Exclusion Standard, 7, 203-205
      RobotUA module for (LWP), 97, 115
robots.txt file, 204-205
rules( ), 98

S
saving (see caching)
scheme( ), 111
security (see authorization/authentication)
Server header, 62, 182
server responses, 11, 24
      cache directives, 172
      response header, 12, 25, 30, 54, 85, 181-184
      Response module for (LWP), 101-103
      response time, 86
      status codes (see status codes, HTTP)
servers (see web servers)
Set-Cookie header, 63, 182
shcat program (example), 79
simple clients (examples), 118-124
Simple module (LWP), 88, 93
socket( ), 66
socket library, 65
sockets, 65-72
      connecting client and server, 71-74
      socket calls, 66-71
source code
      example, obtaining, x
      testing, 6
space character, 37
specifications, HTTP, 46
status codes, HTTP, 30, 47-52, 101, 104
      Status module for (LWP), 104
str2time( ), 106
strict( ), 111
sysread( ), syswrite( ), 66-68, 69

T
tag parameters, 84
TCP/IP, 5
telnet client, 16
testing source code, 6
text, converting HTML to, 108
time and date
      Age header, 182
      Date header, 174
      Date module (LWP), 105
      modification time, 12, 57, 187
      request timeouts, 96
      server response time, 86
time2str( ), 105
timeout( ), 96
Tk extension, 143-145
      dictionary client example, 145-154
      package tracking client example, 154-162
      pinging servers client example, 162-169
TopMargin attribute, 109
TRACE method, 41-42
tracking packages, example programs for, 125-131, 154-162
Transfer-Encoding header, 175
traverse( ), 146
Treebuilder module (LWP), 90

U
Upgrade header, 176
uploading files, 37
uri_escape( ), uri_unescape( ), 110
URI header, 187
URI module (LWP), 91, 110-113
url( ), 100
URL-encoded format, 19, 35-37, 193-194
URLs (uniform resource locators), 3
      deleting, 40
      extracting links from files, 80-84, 90, 121-124
      following with recursive clients, 131-141
      hyperlinks, 15
      options available for, 41
      parsing, 10, 74
      redirection HTTP status codes, 48, 101
      relative, expanding, 91
      storing entity-bodies at, 38-40
      URL module for (LWP), 91, 111
use_alarm( ), 96
user( ), 112
User-Agent header, 61, 181
UserAgent module (LWP), 95-97, 113-116

V
Vary header, 183
versions, HTTP, 29, 41-47, 86, 187
VerticalMargin attribute, 109
Via header, 41

W
WARN flag, 107
Warning header, 183
web clients, 1, 4, 6
      caching, 57-59, 172
      connecting to server, 71-74
      cookies, 63, 179, 182
      design considerations, 84
      examples
            CheckSite, 131-141
            hcat, 76-80, 118-121
            hgrepurl, 81-84, 121-124
            package tracking, 125-131, 154-162
            periodic clients, 125-131
            recursive clients, 131-141
            shcat, 79
            simple clients, 118-124
            webping, 162-169
            xword, 145-154
      identification headers for, 61
      requests (see client requests)
      sockets and (see sockets)
      tracing messages from, 41-42
web servers, 4, 6
      checking if up (example), 162-169
      connecting clients to, 71-74
      HTTP error codes for, 51, 102
      proxy servers, 41-42, 45, 57-59, 97, 115
      publishing documents on, 20-23
      responses (see server responses)
      sending data to, 34-38
      sockets and (see sockets)
      uploading files to, 37
      when down, 85
webping program (example), 162-169
widgets (see Tk extension)
Windows 95, 5
Windows NT, 5
World Wide Web, 2-4
      browsers (see browsers)
writing
      to network connection, 69
      web clients, 84
WWW-Authenticate header, 44, 62, 184

X
xword program (example), 145-154

© 2001, O'Reilly & Associates, Inc.