scan
Procedure
SCAN ( Scan a string for tokens )
ENTRY SCAN ( STRING,
. MARKS, MRKLEN, PNTERS, ROOM, START,
. NTOKNS, IDENT, BEG, END )
Abstract
Scan a string and return the beginnings and ends of recognized
and unrecognized substrings. The full collection of these
substrings partitions the string.
Required_Reading
None.
Keywords
PARSING
Declarations
CHARACTER*(*) STRING
CHARACTER*(*) MARKS ( * )
INTEGER MRKLEN ( * )
INTEGER PNTERS ( * )
INTEGER ROOM
INTEGER START
INTEGER NTOKNS
INTEGER BEG ( * )
INTEGER END ( * )
INTEGER IDENT ( * )
Brief_I/O
VARIABLE  I/O  DESCRIPTION
--------  ---  --------------------------------------------------
STRING     I   String to be scanned.
MARKS      I   Recognizable substrings.
MRKLEN     I   An auxiliary array describing MARKS.
PNTERS     I   An auxiliary array describing MARKS.
ROOM       I   Space available for storing substring descriptions.
START     I-O  Position from which to begin/resume scanning.
NTOKNS     O   Number of scanned substrings.
BEG        O   Beginnings of scanned substrings.
END        O   Endings of scanned substrings.
IDENT      O   Positions of scanned substrings within array MARKS.
Detailed_Input
STRING is any character string that is to be scanned
to locate recognized and unrecognized substrings.
MARKS is an array of marks that will be recognized
by the scanning routine. This array must be prepared
by calling the routine SCANPR.
Note that the blank character is interpreted
in a special way by SCAN. If the blank character,
' ', is one of the MARKS, it will match any unbroken
sequence of blanks in STRING. Thus, if ' ' is the only
mark supplied and STRING is

   'A   lot of      space '
    ......................

then SCAN will locate the following substrings:

   'A'       STRING(1:1)    (unrecognized)
   '   '     STRING(2:4)    (recognized --- all blanks)
   'lot'     STRING(5:7)    (unrecognized)
   ' '       STRING(8:8)    (recognized --- a blank)
   'of'      STRING(9:10)   (unrecognized)
   '      '  STRING(11:16)  (recognized --- all blanks)
   'space'   STRING(17:21)  (unrecognized)
   ' '       STRING(22:22)  (recognized --- a blank)
MRKLEN is an auxiliary array populated by SCANPR
for use by SCAN. It should be declared with
the same dimension as MARKS.
PNTERS is a specially structured array of integers that
describes the array MARKS. It must be filled
in by the routine SCANPR. It should be declared
by the calling program as shown here:
INTEGER PNTERS ( RCHARS )
RCHARS is given by the expression
MAX - MIN + 5
where
MAX is the maximum value of ICHAR(MARKS(I)(1:1))
over the range I = 1, NMARKS
MIN is the minimum value of ICHAR(MARKS(I)(1:1))
over the range I = 1, NMARKS
See SCANPR for a more detailed description of the
declaration of PNTERS.
ROOM is the amount of space available for storing the
results of scanning the string.
START is the position from which scanning should commence.
Values of START less than 1 are treated as 1.
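The sizing rule for PNTERS given above (MAX - MIN + 5) can be checked
with a small helper. The following is only a sketch: PTRSIZ is a
hypothetical name, not a SPICELIB routine, and it assumes NMARKS is at
least 1. Because PNTERS must be dimensioned at compile time, a helper
like this is mainly useful for confirming that a declared dimension is
large enough.

```fortran
      INTEGER FUNCTION PTRSIZ ( NMARKS, MARKS )
!
!     Hypothetical helper, not part of SPICELIB. Returns
!     MAX - MIN + 5, where MAX and MIN are the largest and
!     smallest values of ICHAR( MARKS(I)(1:1) ) for I = 1, NMARKS.
!     Assumes NMARKS is at least 1.
!
      IMPLICIT NONE
      INTEGER               NMARKS
      CHARACTER*(*)         MARKS ( * )

      INTEGER               CMAX
      INTEGER               CMIN
      INTEGER               CODE
      INTEGER               I

!     Start with the first character of the first mark.
      CMAX = ICHAR ( MARKS(1)(1:1) )
      CMIN = CMAX

!     Widen the range using the first character of each
!     remaining mark.
      DO I = 2, NMARKS
         CODE = ICHAR ( MARKS(I)(1:1) )
         CMAX = MAX ( CMAX, CODE )
         CMIN = MIN ( CMIN, CODE )
      END DO

      PTRSIZ = CMAX - CMIN + 5

      RETURN
      END
```

On a platform using ASCII, a single blank mark gives 5, and a set of
marks whose first characters range from ' ' (code 32) to '/' (code 47)
gives 47 - 32 + 5 = 20.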
Detailed_Output
START is the position from which scanning should continue
in order to fully scan STRING (if sufficient memory was
not provided in BEG, END, and IDENT on the current
call to SCAN).
NTOKNS is the number of substrings identified in the current
scan of STRING.
BEG beginnings of scanned substrings. This should be
declared so that it is at least as large as ROOM.
END endings of scanned substrings. This should be declared
so that it is at least as large as ROOM.
IDENT positions of scanned substrings within array MARKS.
If the substring STRING(BEG(I):END(I)) is in the array
MARKS, then MARKS(IDENT(I)) will equal
STRING(BEG(I):END(I)).
If the substring STRING(BEG(I):END(I)) is not in the
list of MARKS then IDENT(I) will have the value 0.
IDENT should be declared so that it can contain at least
ROOM integers.
Parameters
None.
Exceptions
Error free.
1) A space is regarded as a special mark. If MARKS(I) = ' ',
then MARKS(I) will match any consecutive sequence of blanks.
2) If START is less than 1 on input, it will be treated as
if it were 1.
3) If START is greater than the length of the string, no
tokens will be found and the value of START will be returned
unchanged.
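The three cases above, together with the way ROOM and START allow a
scan to be resumed, can be illustrated with a simplified stand-in.
This is a sketch only: BLKSCN is a hypothetical routine, not part of
SPICELIB, and it handles just the case in which ' ' is the only mark.
It does not use MARKS, MRKLEN, or PNTERS and is not how SCAN itself is
implemented.

```fortran
      SUBROUTINE BLKSCN (STRING, ROOM, START, NTOKNS, IDENT, BEG, END)
!
!     Hypothetical stand-in, not part of SPICELIB. It mimics the
!     documented behavior of SCAN for the single mark ' ' only:
!     runs of blanks get IDENT = 1, everything else IDENT = 0.
!
      IMPLICIT NONE
      CHARACTER*(*)         STRING
      INTEGER               ROOM
      INTEGER               START
      INTEGER               NTOKNS
      INTEGER               IDENT ( * )
      INTEGER               BEG   ( * )
      INTEGER               END   ( * )

      INTEGER               LAST
      LOGICAL               BLANK
      LOGICAL               MORE
      LOGICAL               NXTBLK

      NTOKNS = 0

!     Values of START less than 1 are treated as 1.
      IF ( START .LT. 1 ) START = 1

!     If START is beyond the end of STRING, or ROOM is used up,
!     the loop body is skipped and START is left as it is.
      DO WHILE ( START .LE. LEN(STRING) .AND. NTOKNS .LT. ROOM )

         BLANK = STRING(START:START) .EQ. ' '
         LAST  = START
         MORE  = LAST .LT. LEN(STRING)

!        Extend the token while the next character is of the
!        same kind (blank or non-blank) as the first one.
         DO WHILE ( MORE )
            NXTBLK = STRING(LAST+1:LAST+1) .EQ. ' '
            IF ( NXTBLK .EQV. BLANK ) THEN
               LAST = LAST + 1
               MORE = LAST .LT. LEN(STRING)
            ELSE
               MORE = .FALSE.
            END IF
         END DO

         NTOKNS        = NTOKNS + 1
         BEG  (NTOKNS) = START
         END  (NTOKNS) = LAST
         IDENT(NTOKNS) = 0
         IF ( BLANK ) IDENT(NTOKNS) = 1

!        Resume after the token just recorded.
         START = LAST + 1

      END DO

      RETURN
      END
```

Calling this stand-in repeatedly with ROOM = 3 on a 22-character
string such as 'A   lot of      space ' yields 3, 3, and 2 tokens,
leaving START at 8, 17, and 23. A further call finds no tokens and
leaves START unchanged, which is the condition a calling loop can use
to stop.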
Files
None.
Particulars
This routine allows you to scan a string and partition it into
recognized and unrecognized substrings.
For some applications the recognized substrings serve only as
delimiters between the portions of the string
that are of interest to your application. For other
applications the recognized substrings are equally important as
they may indicate operations that are to be performed on the
unrecognized portions of the string. However, the techniques
required to scan the string are the same in both instances. The
examples below illustrate some common situations.
Examples
Example 1.
----------
Suppose you wished to write a routine that would return the words
of a string. The following routine shows how SCANPR and SCAN can
be used to accomplish this task.
      SUBROUTINE GETWDS ( STRING, WDROOM, NWORDS, WORDS )

      CHARACTER*(*)         STRING
      INTEGER               WDROOM
      INTEGER               NWORDS
      CHARACTER*(*)         WORDS  ( * )

      CHARACTER*(1)         MARKS  ( 1 )
      INTEGER               MRKLEN ( 1 )
      INTEGER               PNTERS ( 5 )

      INTEGER               ROOM
      PARAMETER           ( ROOM = 50 )

      INTEGER               BEG    ( ROOM )
      INTEGER               END    ( ROOM )
      INTEGER               I
      INTEGER               IDENT  ( ROOM )
      INTEGER               NMARKS
      INTEGER               NTOKNS
      INTEGER               START

      LOGICAL               FIRST

C     MARKS, MRKLEN, and PNTERS are set up only on the first
C     call, so they are saved along with FIRST.

      SAVE                  FIRST
      SAVE                  MARKS
      SAVE                  MRKLEN
      SAVE                  PNTERS

      DATA                  FIRST / .TRUE. /

C     On the first time through the routine, set up the MARKS,
C     MRKLEN, and PNTERS arrays.

      IF ( FIRST ) THEN

         FIRST    = .FALSE.
         MARKS(1) = ' '
         NMARKS   = 1

         CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

      END IF

C     Now simply scan the input string for words until we have
C     them all or until we run out of room.

      START  = 1
      NWORDS = 0

      CALL SCAN ( STRING,
     .            MARKS,  MRKLEN, PNTERS, ROOM, START,
     .            NTOKNS, IDENT,  BEG,    END         )

C     If we found something in our scan, copy the substrings into
C     the words array.

      DO WHILE (       ( NWORDS .LT. WDROOM )
     .           .AND. ( NTOKNS .GT. 0      ) )

C        Step through the scanned substrings, looking for those
C        that are not blank ...

         I = 1

         DO WHILE (       ( NWORDS .LT. WDROOM )
     .              .AND. ( I      .LE. NTOKNS ) )

C           Copy the non-blank substrings (those unidentified by
C           SCAN) into WORDS.

            IF ( IDENT(I) .EQ. 0 ) THEN
               NWORDS        = NWORDS + 1
               WORDS(NWORDS) = STRING(BEG(I):END(I))
            END IF

            I = I + 1

         END DO

C        Scan the STRING again for any substrings that might
C        remain. Note that START is already pointing at the
C        point in the string from which to resume scanning.

         CALL SCAN ( STRING,
     .               MARKS,  MRKLEN, PNTERS, ROOM, START,
     .               NTOKNS, IDENT,  BEG,    END         )

      END DO

C     That's all, we've got all the substrings there were (or
C     that we had room for).

      RETURN
      END
Example 2.
----------
To parse an algebraic expression such as

   ( X + Y ) * ( 2*Z + SIN(W) ) ** 2

you would select '**', '*', '/', '+', '-', '(', ')' and ' '
to be the markers. Note that all of these begin with one
of the characters in the string ' !"#$%&''()*+,-./'
(ASCII codes 32 through 47), so MAX - MIN + 5 = 20 and
we can declare PNTERS to have length 20.
C     Prepare the MARKS, MRKLEN, and PNTERS.

      CHARACTER*(4)         MARKS  ( 8 )
      INTEGER               NMARKS
      INTEGER               MRKLEN ( 8 )
      INTEGER               PNTERS ( 20 )

      INTEGER               ROOM
      PARAMETER           ( ROOM = 20 )

      INTEGER               NTOKNS
      INTEGER               BEG    ( ROOM )
      INTEGER               END    ( ROOM )
      INTEGER               IDENT  ( ROOM )

      INTEGER               BLANK
      INTEGER               BSRCHC
      INTEGER               I
      INTEGER               KEPT
      INTEGER               START

      LOGICAL               FIRST

      SAVE                  BLANK
      SAVE                  FIRST
      SAVE                  MARKS
      SAVE                  MRKLEN
      SAVE                  PNTERS

      DATA                  FIRST / .TRUE. /

      IF ( FIRST ) THEN

         FIRST    = .FALSE.

         MARKS(1) = '('
         MARKS(2) = ')'
         MARKS(3) = '+'
         MARKS(4) = '-'
         MARKS(5) = '*'
         MARKS(6) = '/'
         MARKS(7) = '**'
         MARKS(8) = ' '
         NMARKS   = 8

         CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

         BLANK = BSRCHC ( ' ', NMARKS, MARKS )

      END IF

C     Once all of the initializations are out of the way,
C     we can scan an input string, starting at its first
C     character.

      START = 1

      CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .            START,  NTOKNS, IDENT,  BEG,    END  )

C     Next eliminate any white space that was returned in the
C     list of tokens.

      KEPT = 0

      DO I = 1, NTOKNS

         IF ( IDENT(I) .NE. BLANK ) THEN

            KEPT        = KEPT + 1
            BEG  (KEPT) = BEG  (I)
            END  (KEPT) = END  (I)
            IDENT(KEPT) = IDENT(I)

         END IF

      END DO

C     Now all of the substrings remaining (the first KEPT entries
C     of BEG, END, and IDENT) point to grouping symbols,
C     operators, functions, or variables. Given that the
C     individual "words" of the expression are now in hand, the
C     meaning of the expression is much easier to determine.
C
C     The rest of the routine is left as a non-trivial exercise
C     for the reader.
Restrictions
1) The arrays MARKS, MRKLEN, and PNTERS must be prepared by the
routine SCANPR prior to supplying them for use by SCAN.
Literature_References
None.
Author_and_Institution
J. Diaz del Rio (ODC Space)
W.L. Taber (JPL)
Version
SPICELIB Version 1.1.0, 26-OCT-2021 (JDR)
Added IMPLICIT NONE statement.
Edited the header to comply with NAIF standard.
SPICELIB Version 1.0.0, 26-JUL-1996 (WLT)
Fri Dec 31 18:36:45 2021