scanpr
Procedure
SCANPR ( Scanning preparation )
ENTRY SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )
Abstract
Prepare recognized markers and auxiliary arrays for the
routine SCAN.
Required_Reading
None.
Keywords
PARSING
UTILITY
Declarations
INTEGER NMARKS
CHARACTER*(*) MARKS ( * )
INTEGER MRKLEN ( * )
INTEGER PNTERS ( * )
Brief_I/O
VARIABLE  I/O  DESCRIPTION
--------  ---  --------------------------------------------------
NMARKS    I-O  Number of recognizable substrings.
MARKS     I-O  Recognizable substrings.
MRKLEN     O   Auxiliary array describing MARKS.
PNTERS     O   Auxiliary array describing MARKS.
Detailed_Input
NMARKS is the number of marks in the array MARKS that are
to be recognized as substrings of STRING, the string to
be scanned by SCAN.
MARKS is an array of marks that will be recognized
by the scanning routine. Leading and trailing
blanks are not significant, except in a mark that
consists of the blank character ' ' itself, where
some part of it must be significant. The case of
the entries in MARKS is significant: the marks
'XX' and 'xx' are regarded as different.
Detailed_Output
NMARKS is the number of marks in the array MARKS after it
has been prepared for SCAN.
MARKS is an array of recognizable substrings.
It has been prepared for use by SCAN
so as to be compatible with the other arrays.
It will be sorted in ascending order, left
justified and contain no duplicate entries.
MRKLEN is an auxiliary array populated by SCANPR
for use by SCAN that describes MARKS.
PNTERS is an auxiliary array populated by SCANPR for
use by SCAN. It should be declared in the
calling program as
INTEGER PNTERS ( RCHARS )
RCHARS is given by the expression
MAX - MIN + 5
where
MAX is the maximum value of ICHAR(MARKS(I)(1:1))
over the range I = 1, NMARKS
MIN is the minimum value of ICHAR(MARKS(I)(1:1))
over the range I = 1, NMARKS
Here are some typical values that may help you avoid
going through the computations above. (This assumes
that ICHAR returns the ASCII code for a character.)
Scanning Situation                               RCHARS
-----------------------------------------------  ------
NMARKS = 1, or all MARKS begin with the same          5
character.

All MARKS begin with one of the characters in        20
the string ' !"#$%&''()*+,-./'

All MARKS begin with one of the characters in        11
the string ':;<=>?@'

All MARKS begin with one of the characters in        37
the string ' !"#$%&''()*+,-./:;<=>?@'

All MARKS begin with an upper case English           30
letter.

All MARKS begin with a decimal digit.                14

All MARKS begin with a lower case English            30
letter.

All MARKS begin with a digit or upper case           47
character.

All MARKS begin with a printing character or        100
a blank.

Anything might be a mark.                           132
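The sizing rule MAX - MIN + 5 and the typical values above can be checked with a short function. This is a minimal sketch only: the name RCHSIZ is hypothetical and not part of SPICELIB, the Fortran 90 intrinsic VERIFY is used to skip leading blanks, ICHAR is assumed to return ASCII codes, and returning 5 when there are no marks is an assumption of the sketch.

```fortran
      INTEGER FUNCTION RCHSIZ ( N, MARKS )

      ! Hypothetical helper (not a SPICELIB routine): smallest
      ! dimension of PNTERS for the N marks in MARKS, computed
      ! as MAX - MIN + 5 over the first characters of the marks.
      IMPLICIT NONE

      INTEGER               N
      CHARACTER*(*)         MARKS ( * )

      INTEGER               I
      INTEGER               J
      INTEGER               CODE
      INTEGER               LO
      INTEGER               HI

      ! Assumption: with no marks, return the smallest value
      ! the rule can produce.
      RCHSIZ = 5
      IF ( N .LE. 0 ) RETURN

      LO = 0
      HI = 0

      DO I = 1, N

         ! Leading blanks are not significant, so use the first
         ! non-blank character. VERIFY returns 0 for an
         ! all-blank mark, which is treated as ' ' itself.
         J = VERIFY ( MARKS(I), ' ' )
         IF ( J .EQ. 0 ) J = 1

         CODE = ICHAR ( MARKS(I)(J:J) )

         IF ( I .EQ. 1 ) THEN
            LO = CODE
            HI = CODE
         ELSE
            LO = MIN ( LO, CODE )
            HI = MAX ( HI, CODE )
         END IF

      END DO

      RCHSIZ = HI - LO + 5

      END
```

For the marks ' ', '---', ':', ',' and ';', whose first characters span ASCII codes 32 through 59, RCHSIZ returns 59 - 32 + 5 = 32.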
Finally, so you won't have to look them up elsewhere,
here are the ASCII codes for the blank and the printing
characters.
(Common Punctuations) Character ASCII Code
----------- ----------
' ' (space) 32
'!' 33
'"' 34
'#' 35
'$' 36
'%' 37
'&' 38
'''' 39
'(' 40
')' 41
'*' 42
'+' 43
',' 44
'-' 45
'.' 46
'/' 47
(Decimal Digits) Character ASCII Code
----------- ----------
'0' 48
'1' 49
'2' 50
'3' 51
'4' 52
'5' 53
'6' 54
'7' 55
'8' 56
'9' 57
(More punctuation) Character ASCII Code
----------- ----------
':' 58
';' 59
'<' 60
'=' 61
'>' 62
'?' 63
'@' 64
(Uppercase characters) Character ASCII Code
----------- ----------
'A' 65
'B' 66
'C' 67
'D' 68
'E' 69
'F' 70
'G' 71
'H' 72
'I' 73
'J' 74
'K' 75
'L' 76
'M' 77
'N' 78
'O' 79
'P' 80
'Q' 81
'R' 82
'S' 83
'T' 84
'U' 85
'V' 86
'W' 87
'X' 88
'Y' 89
'Z' 90
(More punctuation) Character ASCII Code
----------- ----------
'[' 91
'\' 92
']' 93
'^' 94
'_' 95
'`' 96
(Lowercase characters) Character ASCII Code
----------- ----------
'a' 97
'b' 98
'c' 99
'd' 100
'e' 101
'f' 102
'g' 103
'h' 104
'i' 105
'j' 106
'k' 107
'l' 108
'm' 109
'n' 110
'o' 111
'p' 112
'q' 113
'r' 114
's' 115
't' 116
'u' 117
'v' 118
'w' 119
'x' 120
'y' 121
'z' 122
(More punctuation) Character ASCII Code
----------- ----------
'{' 123
'|' 124
'}' 125
'~' 126
Parameters
None.
Exceptions
Error free.
1) A space is regarded as a special mark. If MARKS(I) = ' ',
then MARKS(I) will match any consecutive sequence of blanks.
2) If NMARKS is less than or equal to zero, SCAN will always
find a single token, namely the entire string to be scanned.
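The rule in item 1, that a blank mark matches any consecutive sequence of blanks, can be illustrated with a small function. This is a sketch of the matching rule only, not the code used by SCAN; the name BLKEND and its interface are hypothetical.

```fortran
      INTEGER FUNCTION BLKEND ( STRING, START )

      ! Hypothetical helper (not a SPICELIB routine): index of
      ! the last blank in the run of consecutive blanks that
      ! begins at STRING(START:START). If that character is not
      ! a blank, START - 1 is returned (an empty match).
      ! The caller must ensure 1 <= START <= LEN(STRING).
      IMPLICIT NONE

      CHARACTER*(*)         STRING
      INTEGER               START

      INTEGER               I

      I = START - 1

      ! Extend the match while the next character is a blank.
      DO WHILE ( I .LT. LEN ( STRING ) )
         IF ( STRING(I+1:I+1) .NE. ' ' ) EXIT
         I = I + 1
      END DO

      BLKEND = I

      END
```

With STRING = 'AB   CD' and START = 3, the whole run of three blanks is matched and BLKEND returns 5.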
Files
None.
Particulars
This routine prepares the arrays MARKS, MRKLEN and PNTERS
so that they are suitable for input to the routine SCAN.
It is expected that users will need to scan many strings.
From the programming point of view, it is easiest to
supply a list of MARKS once to a "formatting" routine
such as this one, so that the strings can then be
scanned efficiently by the routine SCAN. This formatting
is the function of this routine.
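The documented properties of the prepared MARKS array (left justified, sorted in ascending order, no duplicate entries, NMARKS updated) can be sketched as follows. This is not SPICELIB's implementation: the name PREPMK is hypothetical, ASCII ordering via the intrinsic LLE is an assumption, the Fortran 90 intrinsic ADJUSTL stands in for whatever justification SCANPR performs, and MRKLEN and PNTERS are not modeled.

```fortran
      SUBROUTINE PREPMK ( N, MARKS )

      ! Hypothetical sketch (not a SPICELIB routine): left
      ! justify the N marks, sort them in ascending ASCII
      ! order, and remove duplicates. On exit N is the number
      ! of distinct marks.
      IMPLICIT NONE

      INTEGER               N
      CHARACTER*(*)         MARKS ( * )

      INTEGER               I
      INTEGER               J
      INTEGER               K
      CHARACTER(LEN=LEN(MARKS)) TEMP

      IF ( N .LE. 0 ) RETURN

      ! Leading blanks are not significant: left justify.
      DO I = 1, N
         MARKS(I) = ADJUSTL ( MARKS(I) )
      END DO

      ! Insertion sort; LLE compares in ASCII order.
      DO I = 2, N
         TEMP = MARKS(I)
         J    = I - 1
         DO WHILE ( J .GE. 1 )
            IF ( LLE ( MARKS(J), TEMP ) ) EXIT
            MARKS(J+1) = MARKS(J)
            J          = J - 1
         END DO
         MARKS(J+1) = TEMP
      END DO

      ! Equal marks are now adjacent: keep one copy of each.
      K = 1
      DO I = 2, N
         IF ( MARKS(I) .NE. MARKS(K) ) THEN
            K        = K + 1
            MARKS(K) = MARKS(I)
         END IF
      END DO

      N = K

      END
```

Given the six marks ' ;', ':', ';', ' ', '---' and ':', this sketch leaves N = 4 with MARKS(1) through MARKS(4) equal to ' ', '---', ':' and ';'.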
Examples
Suppose you need to identify all of the words within a string
and wish to ignore punctuation marks such as ' ', ',', ':', ';'
and '---'. The first step is to load the array of marks as
shown here:
The minimum ASCII code for the first character of a marker is
32 (for ' ').

   INTEGER               FCHAR
   PARAMETER           ( FCHAR = 32 )

The maximum ASCII code for the first character of a marker is
59 (for ';').

   INTEGER               LCHAR
   PARAMETER           ( LCHAR = 59 )

The proper size to declare PNTERS is given by the parameter
RCHAR defined in terms of LCHAR and FCHAR.

   INTEGER               RCHAR
   PARAMETER           ( RCHAR = LCHAR - FCHAR + 5 )

   LOGICAL               FIRST
   CHARACTER*(4)         MARKS  ( 5 )
   INTEGER               NMARKS
   INTEGER               MRKLEN ( 5 )
   INTEGER               PNTERS ( RCHAR )

   SAVE                  FIRST
   SAVE                  MARKS
   SAVE                  MRKLEN
   SAVE                  PNTERS

   DATA                  FIRST / .TRUE. /

   IF ( FIRST ) THEN

      FIRST    = .FALSE.

      MARKS(1) = ' '
      MARKS(2) = '---'
      MARKS(3) = ':'
      MARKS(4) = ','
      MARKS(5) = ';'
      NMARKS   = 5

      CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

   END IF
Notice that the call to SCANPR is nested inside an
IF ( FIRST ) THEN ... END IF block. In this and many other
applications the marks that will be used in the scan are fixed.
Since the marks are not changing, you need to process MARKS and
set up the auxiliary arrays MRKLEN and PNTERS only once (assuming
that you SAVE the appropriate variables as has been done above).
In this way, if the code is executed many times, there is only
a small overhead required for preparing the data so that it
can be used efficiently in scanning.
Restrictions
1) MRKLEN and PNTERS must be declared to be at least as large
as indicated above. If not, this routine will write
past the ends of these arrays. Much unpleasantness may
ensue in the attempt to debug such problems.
Literature_References
None.
Author_and_Institution
J. Diaz del Rio (ODC Space)
W.L. Taber (JPL)
Version
SPICELIB Version 1.1.0, 26-OCT-2021 (JDR)
Added IMPLICIT NONE statement.
Edited the header to comply with NAIF standard.
SPICELIB Version 1.0.0, 26-JUL-1996 (WLT)
Fri Dec 31 18:36:45 2021