scan
Procedure

   SCAN ( Scan a string for tokens )

      ENTRY SCAN ( STRING,
     .             MARKS,  MRKLEN, PNTERS, ROOM, START,
     .             NTOKNS, IDENT,  BEG,    END          )

Abstract

   Scan a string and return the beginnings and ends of recognized
   and unrecognized substrings. The full collection of these
   substrings partitions the string.

Required_Reading

   None.

Keywords

   PARSING

Declarations

      CHARACTER*(*)         STRING
      CHARACTER*(*)         MARKS  ( * )
      INTEGER               MRKLEN ( * )
      INTEGER               PNTERS ( * )
      INTEGER               ROOM
      INTEGER               START
      INTEGER               NTOKNS
      INTEGER               BEG    ( * )
      INTEGER               END    ( * )
      INTEGER               IDENT  ( * )

Brief_I/O

   VARIABLE  I/O  DESCRIPTION
   --------  ---  --------------------------------------------------
   STRING     I   string to be scanned.
   MARKS      I   recognizable substrings.
   MRKLEN     I   an auxiliary array describing MARKS.
   PNTERS     I   an auxiliary array describing MARKS.
   ROOM       I   space available for storing substring descriptions.
   START     I-O  position from which to begin/resume scanning.
   NTOKNS     O   number of scanned substrings.
   BEG        O   beginnings of scanned substrings.
   END        O   endings of scanned substrings.
   IDENT      O   position of scanned substring within array MARKS.

Detailed_Input

   STRING   is any character string that is to be scanned
            to locate recognized and unrecognized substrings.

   MARKS    is an array of marks that will be recognized
            by the scanning routine. This array must be prepared
            by calling the routine SCANPR.

            Note that the blank string is interpreted
            in a special way by SCAN. If the blank character,
            ' ', is one of the MARKS, it will match any unbroken
            sequence of blanks in STRING. Thus if ' ' is the only
            mark supplied and STRING is

               'A   lot of      space '
                ......................

            Then SCAN will locate the following substrings:

               'A'       STRING(1:1)    (unrecognized)
               '   '     STRING(2:4)    (recognized --- all blanks)
               'lot'     STRING(5:7)    (unrecognized)
               ' '       STRING(8:8)    (recognized --- a blank)
               'of'      STRING(9:10)   (unrecognized)
               '      '  STRING(11:16)  (recognized --- all blanks)
               'space'   STRING(17:21)  (unrecognized)
               ' '       STRING(22:22)  (recognized --- a blank)

   MRKLEN   is an auxiliary array populated by SCANPR
            for use by SCAN. It should be declared with the same
            dimension as MARKS. It must be prepared for use by
            the routine SCANPR.

   PNTERS   is a specially structured array of integers that
            describes the array MARKS. It must be filled in by
            the routine SCANPR. It should be declared by the
            calling program as shown here:

               INTEGER  PNTERS ( RCHARS )

            RCHARS is given by the expression

               MAX - MIN + 5

            where

               MAX  is the maximum value of ICHAR(MARKS(I)(1:1))
                    over the range I = 1, NMARKS

               MIN  is the minimum value of ICHAR(MARKS(I)(1:1))
                    over the range I = 1, NMARKS

            See SCANPR for a more detailed description of the
            declaration of PNTERS.

   ROOM     is the amount of space available for storing the
            results of scanning the string.

   START    is the position from which scanning should commence.
            Values of START less than 1 are treated as 1.

Detailed_Output

   START    is the position from which scanning should continue
            in order to fully scan STRING (if sufficient memory
            was not provided in BEG, END, and IDENT on the current
            call to SCAN).

   NTOKNS   is the number of substrings identified in the current
            scan of STRING.

   BEG      beginnings of scanned substrings.
            This should be declared so that it is at least
            as large as ROOM.

   END      endings of scanned substrings.
            This should be declared so that it is at least
            as large as ROOM.

   IDENT    positions of scanned substring within array MARKS.
            If the substring STRING(BEG(I):END(I)) is in the array
            MARKS, then MARKS(IDENT(I)) will equal
            STRING(BEG(I):END(I)).

            If the substring STRING(BEG(I):END(I)) is not in the
            list of MARKS then IDENT(I) will have the value 0.

            IDENT should be declared so that it can contain at
            least ROOM integers.

Parameters

   None.

Exceptions

   Error free.

   1)  A space is regarded as a special mark. If MARKS(I) = ' ',
       then MARKS(I) will match any consecutive sequence of blanks.

   2)  If START is less than 1 on input, it will be treated as
       if it were 1.

   3)  If START is greater than the length of the string, no
       tokens will be found and the value of START will be
       returned unchanged.

Files

   None.

Particulars

   This routine allows you to scan a string and partition it into
   recognized and unrecognized substrings.

   For some applications the recognized substrings serve only as
   delimiters between the portions of the string that are of
   interest to your application. For other applications the
   recognized substrings are equally important as they may
   indicate operations that are to be performed on the
   unrecognized portions of the string. However, the techniques
   required to scan the string are the same in both instances.
   The examples below illustrate some common situations.

Examples

   Example 1.
   ----------

   Suppose you wished to write a routine that would return the
   words of a string. The following routine shows how SCANPR and
   SCAN can be used to accomplish this task.

      SUBROUTINE GETWDS ( STRING, WDROOM, NWORDS, WORDS )

      CHARACTER*(*)         STRING
      INTEGER               WDROOM
      INTEGER               NWORDS
      CHARACTER*(*)         WORDS  ( * )

      CHARACTER*(1)         MARKS  ( 1 )
      INTEGER               MRKLEN ( 1 )
      INTEGER               PNTERS ( 5 )

      INTEGER               ROOM
      PARAMETER           ( ROOM = 50 )

      INTEGER               BEG    ( ROOM )
      INTEGER               END    ( ROOM )
      INTEGER               I
      INTEGER               IDENT  ( ROOM )
      INTEGER               NMARKS
      INTEGER               NTOKNS
      INTEGER               START

C     MARKS, MRKLEN, and PNTERS are set up only on the first call,
C     so they must be saved between calls along with FIRST.

      LOGICAL               FIRST
      SAVE                  FIRST
      SAVE                  MARKS
      SAVE                  MRKLEN
      SAVE                  PNTERS

      DATA                  FIRST  / .TRUE. /

C     On the first time through the routine, set up the MARKS,
C     MRKLEN, and PNTERS arrays.

      IF ( FIRST ) THEN

         FIRST    = .FALSE.
         MARKS(1) = ' '
         NMARKS   = 1

         CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

      END IF

C     Now simply scan the input string for words until we have
C     them all or until we run out of room.

      START  = 1
      NWORDS = 0

      CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .            START,  NTOKNS, IDENT,  BEG,    END   )

C     If we found something in our scan, copy the substrings into
C     the words array.

      DO WHILE (       ( NWORDS .LT. WDROOM )
     .           .AND. ( NTOKNS .GT. 0      ) )

C        Step through the scanned substrings, looking for those
C        that are not blank ...

         I = 1

         DO WHILE (       ( NWORDS .LT. WDROOM )
     .              .AND. ( I      .LE. NTOKNS ) )

C           Copy the non-blank substrings (those unidentified by
C           SCAN) into WORDS.

            IF ( IDENT(I) .EQ. 0 ) THEN
               NWORDS        = NWORDS + 1
               WORDS(NWORDS) = STRING(BEG(I):END(I))
            END IF

            I = I + 1

         END DO

C        Scan the STRING again for any substrings that might
C        remain. Note that START is already pointing at the
C        point in the string from which to resume scanning.

         CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .               START,  NTOKNS, IDENT,  BEG,    END   )

      END DO

C     That's all, we've got all the substrings there were (or
C     that we had room for).

      RETURN
      END

   Example 2.
   ----------

   To parse an algebraic expression such as

      ( X + Y ) * ( 2*Z + SIN(W) ) ** 2

   you would select '**', '*', '/', '+', '-', '(', ')' and ' '
   to be the markers. Note that all of these begin with one of
   the characters in the string ' !"#$%&''()*+,-./'
   so that we can declare PNTERS to have length 20.

   Prepare the MARKS, MRKLEN, and PNTERS.

      CHARACTER*(4)         MARKS  ( 8 )
      INTEGER               NMARKS
      INTEGER               MRKLEN ( 8 )
      INTEGER               PNTERS ( 20 )

      INTEGER               ROOM
      PARAMETER           ( ROOM = 20 )

      INTEGER               NTOKNS
      INTEGER               BEG    ( ROOM )
      INTEGER               END    ( ROOM )
      INTEGER               IDENT  ( ROOM )

C     BSRCHC is the SPICELIB binary search function; it and the
C     remaining local variables need explicit INTEGER types.

      INTEGER               BLANK
      INTEGER               BSRCHC
      INTEGER               I
      INTEGER               KEPT
      INTEGER               START

      LOGICAL               FIRST
      SAVE                  BLANK
      SAVE                  FIRST
      SAVE                  MARKS
      SAVE                  MRKLEN
      SAVE                  PNTERS

      DATA                  FIRST  / .TRUE. /

      IF ( FIRST ) THEN

         FIRST    = .FALSE.

         MARKS(1) = '('
         MARKS(2) = ')'
         MARKS(3) = '+'
         MARKS(4) = '-'
         MARKS(5) = '*'
         MARKS(6) = '/'
         MARKS(7) = '**'
         MARKS(8) = ' '
         NMARKS   = 8

         CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

         BLANK = BSRCHC ( ' ', NMARKS, MARKS )

      END IF

   Once all of the initializations are out of the way,
   we can scan an input string, starting from its first character.

      START = 1

      CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .            START,  NTOKNS, IDENT,  BEG,    END   )

   Next eliminate any white space that was returned in the
   list of tokens.

      KEPT = 0

      DO I = 1, NTOKNS

         IF ( IDENT(I) .NE. BLANK ) THEN

            KEPT        = KEPT + 1
            BEG  (KEPT) = BEG(I)
            END  (KEPT) = END(I)
            IDENT(KEPT) = IDENT(I)

         END IF

      END DO

   Now all of the substrings remaining point to grouping symbols,
   operators, functions, or variables. Given that the individual
   "words" of the expression are now in hand, the meaning of the
   expression is much easier to determine.

   The rest of the routine is left as a non-trivial exercise
   for the reader.

Restrictions

   1)  The arrays MARKS, MRKLEN, and PNTERS must be prepared by the
       routine SCANPR prior to supplying them for use by SCAN.

Literature_References

   None.

Author_and_Institution

   J. Diaz del Rio    (ODC Space)
   W.L. Taber         (JPL)

Version

   SPICELIB Version 1.1.0, 26-OCT-2021 (JDR)

      Added IMPLICIT NONE statement.

      Edited the header to comply with NAIF standard.

   SPICELIB Version 1.0.0, 26-JUL-1996 (WLT)
Fri Dec 31 18:36:45 2021