scan

Table of contents

Procedure
Abstract
Required_Reading
Keywords
Declarations
Brief_I/O

Detailed_Input
Detailed_Output
Parameters
Exceptions
Files
Particulars

Examples
Restrictions
Literature_References
Author_and_Institution
Version

Procedure

     SCAN ( Scan a string for tokens )

     ENTRY SCAN   ( STRING,
    .               MARKS,  MRKLEN, PNTERS, ROOM,   START,
    .               NTOKNS, IDENT,  BEG,    END            )

Abstract

     Scan a string and return the beginnings and ends of recognized
     and unrecognized substrings. The full collection of these
     substrings partitions the string.

Required_Reading

     None.

Keywords

     PARSING

Declarations

    CHARACTER*(*)         STRING
    CHARACTER*(*)         MARKS   ( * )
    INTEGER               MRKLEN  ( * )
    INTEGER               PNTERS  ( * )
    INTEGER               ROOM
    INTEGER               START
    INTEGER               NTOKNS
    INTEGER               BEG     ( * )
    INTEGER               END     ( * )
    INTEGER               IDENT   ( * )

Brief_I/O

     VARIABLE  I/O  DESCRIPTION
     --------  ---  --------------------------------------------------
     STRING     I   string to be scanned.
     MARKS      I   recognizable substrings.
     MRKLEN     I   an auxiliary array describing MARKS.
     PNTERS     I   an auxiliary array describing MARKS.
     ROOM       I   space available for storing substring descriptions.
     START     I-O  position from which to begin/resume scanning.
     NTOKNS     O   number of scanned substrings.
     BEG        O   beginnings of scanned substrings.
     END        O   endings of scanned substrings.
     IDENT      O   positions of scanned substrings within array MARKS.

Detailed_Input

     STRING   is any character string that is to be scanned
              to locate recognized and unrecognized substrings.

     MARKS    is an array of marks that will be recognized
              by the scanning routine. This array must be prepared
              by calling the routine SCANPR.

              Note that the blank string is interpreted
              in a special way by SCAN. If the blank character,
              ' ', is one of the MARKS, it will match any unbroken
              sequence of blanks in STRING. Thus if ' ' is the only
              mark supplied and STRING is

                 'A   lot of      space '
                  ......................

              then SCAN will locate the following substrings:

              'A'          STRING(1:1)    (unrecognized)
              '   '        STRING(2:4)    (recognized --- all blanks)
              'lot'        STRING(5:7)    (unrecognized)
              ' '          STRING(8:8)    (recognized --- a blank)
              'of'         STRING(9:10)   (unrecognized)
              '      '     STRING(11:16)  (recognized --- all blanks)
              'space'      STRING(17:21)  (unrecognized)
              ' '          STRING(22:22)  (recognized --- a blank)
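
              The partition above can be checked with a short driver.
              This is a minimal sketch, assuming the program is linked
              against SPICELIB so that SCANPR and SCAN are available;
              the program name BLANKS and the output layout are
              illustrative choices, not part of the library.

```fortran
      PROGRAM BLANKS
      IMPLICIT NONE

      INTEGER               ROOM
      PARAMETER           ( ROOM = 20 )

      CHARACTER*(22)        STRING
      CHARACTER*(1)         MARKS  ( 1 )
      INTEGER               MRKLEN ( 1 )
      INTEGER               PNTERS ( 5 )
      INTEGER               BEG    ( ROOM )
      INTEGER               END    ( ROOM )
      INTEGER               IDENT  ( ROOM )
      INTEGER               I
      INTEGER               NMARKS
      INTEGER               NTOKNS
      INTEGER               START

C     The blank is the only mark, so PNTERS needs
C     MAX - MIN + 5 = 5 elements.
      STRING   = 'A   lot of      space '
      MARKS(1) = ' '
      NMARKS   = 1

      CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

      START = 1

      CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .            START,  NTOKNS, IDENT,  BEG,    END  )

C     Print the bounds, the index into MARKS (0 for an
C     unrecognized substring), and the substring itself.
      DO I = 1, NTOKNS
         WRITE (*,*) BEG(I), END(I), IDENT(I),
     .               ' [', STRING(BEG(I):END(I)), ']'
      END DO

      END
```

              Each run of blanks should come back as a single token
              with a non-zero IDENT, and each word as a token with
              IDENT equal to 0.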

     MRKLEN   is an auxiliary array populated by SCANPR
              for use by SCAN. It should be declared with
              the same number of elements as MARKS.

     PNTERS   is a specially structured array of integers that
              describes the array MARKS. It must be filled
              in by the routine SCANPR. It should be declared
              by the calling program as shown here:

                 INTEGER  PNTERS ( RCHARS )

              RCHARS is given by the expression

                MAX - MIN + 5

              where

              MAX is the maximum value of ICHAR(MARKS(I)(1:1))
                  over the range I = 1, NMARKS

              MIN is the minimum value of ICHAR(MARKS(I)(1:1))
                  over the range I = 1, NMARKS

              See SCANPR for a more detailed description of the
              declaration of PNTERS.
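
              The sizing rule can be evaluated directly. The sketch
              below is a standalone program (no SPICELIB calls) that
              computes MAX - MIN + 5 for the marks used in Example 2;
              the names RCHSIZ, MAXC, MINC, and CODE are hypothetical
              and chosen only for this illustration.

```fortran
      PROGRAM RCHSIZ
      IMPLICIT NONE

      INTEGER               NMARKS
      PARAMETER           ( NMARKS = 8 )

      CHARACTER*(4)         MARKS  ( NMARKS )
      INTEGER               CODE
      INTEGER               I
      INTEGER               MAXC
      INTEGER               MINC
      INTEGER               RCHARS

      DATA                  MARKS / '(', ')', '+', '-',
     .                              '*', '/', '**', ' ' /

C     Find the largest and smallest character codes among the
C     first characters of the marks.
      MAXC = ICHAR( MARKS(1)(1:1) )
      MINC = MAXC

      DO I = 2, NMARKS
         CODE = ICHAR( MARKS(I)(1:1) )
         MAXC = MAX( MAXC, CODE )
         MINC = MIN( MINC, CODE )
      END DO

C     Apply the sizing expression given above.
      RCHARS = MAXC - MINC + 5

      WRITE (*,*) 'PNTERS needs at least ', RCHARS, ' elements'

      END
```

              On a platform that uses the ASCII collating sequence the
              extreme first characters are ' ' (32) and '/' (47), so
              the program reports 20 elements.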

     ROOM     is the amount of space available for storing the
              results of scanning the string.

     START    is the position from which scanning should commence.
              Values of START less than 1 are treated as 1.

Detailed_Output

     START    is the position from which scanning should continue
              in order to fully scan STRING (if sufficient memory was
              not provided in BEG, END, and IDENT on the current
              call to SCAN).

     NTOKNS   is the number of substrings identified in the current
              scan of STRING.

     BEG      are the beginnings of scanned substrings. This should be
              declared so that it is at least as large as ROOM.

     END      are the endings of scanned substrings. This should be
              declared so that it is at least as large as ROOM.

     IDENT    are the positions of the scanned substrings within
              the array MARKS.
              If the substring STRING(BEG(I):END(I)) is in the array
              MARKS, then MARKS(IDENT(I)) will equal
              STRING(BEG(I):END(I)).

              If the substring STRING(BEG(I):END(I)) is not in the
              list of MARKS then IDENT(I) will have the value 0.

              IDENT should be declared so that it can contain at least
              ROOM integers.
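
              The interplay of ROOM, START, and IDENT can be sketched
              with a deliberately small ROOM, so that several calls are
              needed to cover the string. This is an illustration
              only, assuming the program is linked against SPICELIB;
              the program name RESUME and the sample string are
              hypothetical.

```fortran
      PROGRAM RESUME
      IMPLICIT NONE

      INTEGER               ROOM
      PARAMETER           ( ROOM = 3 )

      CHARACTER*(16)        STRING
      CHARACTER*(1)         MARKS  ( 2 )
      INTEGER               MRKLEN ( 2 )
      INTEGER               PNTERS ( 17 )
      INTEGER               BEG    ( ROOM )
      INTEGER               END    ( ROOM )
      INTEGER               IDENT  ( ROOM )
      INTEGER               I
      INTEGER               NMARKS
      INTEGER               NTOKNS
      INTEGER               START

C     Marks are ',' and ' '. In ASCII their codes are 44 and 32,
C     so PNTERS needs 44 - 32 + 5 = 17 elements.
      STRING   = 'RED, GREEN, BLUE'
      MARKS(1) = ','
      MARKS(2) = ' '
      NMARKS   = 2

      CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

      START = 1

      CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .            START,  NTOKNS, IDENT,  BEG,    END  )

C     Each call returns at most ROOM tokens and leaves START at
C     the position from which the next call should resume.
      DO WHILE ( NTOKNS .GT. 0 )

         DO I = 1, NTOKNS
            IF ( IDENT(I) .EQ. 0 ) THEN
               WRITE (*,*) 'unrecognized: ', STRING(BEG(I):END(I))
            ELSE
               WRITE (*,*) 'mark number : ', IDENT(I)
            END IF
         END DO

         CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
     .               START,  NTOKNS, IDENT,  BEG,    END  )

      END DO

      END
```

              The loop ends when a call returns NTOKNS equal to 0,
              which is what happens once START has moved past the end
              of STRING.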

Parameters

     None.

Exceptions

     Error free.

     1)  A space is regarded as a special mark. If MARKS(I) = ' ',
         then MARKS(I) will match any consecutive sequence of blanks.

     2)  If START is less than 1 on input, it will be treated as
         if it were 1.

     3)  If START is greater than the length of the string, no
         tokens will be found and the value of START will return
         unchanged.

Files

     None.

Particulars

     This routine allows you to scan a string and partition it into
     recognized and unrecognized substrings.

     For some applications the recognized substrings serve only as
     delimiters between the portions of the string
     that are of interest to your application. For other
     applications the recognized substrings are equally important as
     they may indicate operations that are to be performed on the
     unrecognized portions of the string. However, the techniques
     required to scan the string are the same in both instances. The
     examples below illustrate some common situations.

Examples

     Example 1.
     ----------

     Suppose you wished to write a routine that would return the words
     of a string. The following routine shows how SCANPR and SCAN can
     be used to accomplish this task.

        SUBROUTINE GETWDS ( STRING, WDROOM, NWORDS, WORDS )

        CHARACTER*(*)      STRING
        INTEGER            WDROOM
        INTEGER            NWORDS
        CHARACTER*(*)      WORDS  ( * )


        CHARACTER*(1)      MARKS  ( 1 )
        INTEGER            MRKLEN ( 1 )
        INTEGER            PNTERS ( 5 )

        INTEGER            ROOM
        PARAMETER        ( ROOM = 50 )

        INTEGER            BEG   ( ROOM )
        INTEGER            END   ( ROOM )
        INTEGER            I
        INTEGER            IDENT ( ROOM )
        INTEGER            NMARKS
        INTEGER            NTOKNS
        INTEGER            START

        LOGICAL            FIRST

  C     The arrays prepared by SCANPR are set up only once, so
  C     they must be saved between calls along with FIRST.

        SAVE               FIRST
        SAVE               MARKS
        SAVE               MRKLEN
        SAVE               PNTERS

        DATA               FIRST  / .TRUE. /


  C     On the first time through the routine, set up the MARKS,
  C     MRKLEN, and PNTERS arrays.

        IF ( FIRST ) THEN

           FIRST    = .FALSE.
           MARKS(1) = ' '
           NMARKS   = 1

           CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

        END IF

  C     Now simply scan the input string for words until we have
  C     them all or until we run out of room.

        START  = 1
        NWORDS = 0

        CALL SCAN ( STRING,
       .            MARKS,  MRKLEN, PNTERS, ROOM, START,
       .            NTOKNS, IDENT,  BEG,    END          )

  C     If we found something in our scan, copy the substrings into
  C     the words array.

        DO WHILE (       ( NWORDS .LT. WDROOM )
       .           .AND. ( NTOKNS .GT. 0      ) )


  C        Step through the scanned substrings, looking for those
  C        that are not blank ...

           I = 1

           DO WHILE (       ( NWORDS .LT. WDROOM )
       .              .AND. ( I      .LE. NTOKNS ) )

  C           Copy the non-blank substrings (those unidentified by
  C           SCAN) into WORDS.

              IF ( IDENT(I) .EQ. 0 ) THEN
                 NWORDS        = NWORDS + 1
                 WORDS(NWORDS) = STRING(BEG(I):END(I))
              END IF

              I      = I      + 1

           END DO


  C        Scan the STRING again for any substrings that might
  C        remain. Note that START is already pointing at the
  C        point in the string from which to resume scanning.

           CALL SCAN ( STRING,
       .               MARKS,  MRKLEN, PNTERS, ROOM, START,
       .               NTOKNS, IDENT,  BEG,    END          )
        END DO

  C     That's all, we've got all the substrings there were (or
  C     that we had room for).

        RETURN
        END


     Example 2.
     ----------

     To parse an algebraic expression such as

        ( X + Y ) * ( 2*Z + SIN(W) ) ** 2

     you would select '**', '*', '/', '+', '-', '(', ')' and ' '
     to be the markers. Note that all of these begin with one
     of the characters in the string ' !"#$%&''()*+,-./'
     so that, in ASCII, MAX - MIN + 5 = 47 - 32 + 5 = 20 and
     we can declare PNTERS to have length 20.

     Prepare the MARKS, MRKLEN, and PNTERS.

        CHARACTER*(4)         MARKS  ( 8  )
        INTEGER               NMARKS
        INTEGER               MRKLEN ( 8  )
        INTEGER               PNTERS ( 20 )

        INTEGER               ROOM
        PARAMETER           ( ROOM = 20 )

        INTEGER               NTOKNS
        INTEGER               BEG    ( ROOM )
        INTEGER               END    ( ROOM )
        INTEGER               IDENT  ( ROOM )

        INTEGER               BLANK
        INTEGER               BSRCHC
        INTEGER               I
        INTEGER               KEPT
        INTEGER               START

        LOGICAL               FIRST
        SAVE                  BLANK
        SAVE                  FIRST
        SAVE                  MARKS
        SAVE                  MRKLEN
        SAVE                  PNTERS

        DATA                  FIRST  / .TRUE. /

        IF ( FIRST ) THEN

           FIRST    = .FALSE.

           MARKS(1) = '('
           MARKS(2) = ')'
           MARKS(3) = '+'
           MARKS(4) = '-'
           MARKS(5) = '*'
           MARKS(6) = '/'
           MARKS(7) = '**'
           MARKS(8) = ' '

           NMARKS   = 8

           CALL SCANPR ( NMARKS, MARKS, MRKLEN, PNTERS )

  C        Locate the blank among the marks. BSRCHC is a binary
  C        search, so this relies on MARKS being ordered once
  C        SCANPR has prepared it.

           BLANK = BSRCHC ( ' ', NMARKS, MARKS )

        END IF


  C     Once all of the initializations are out of the way,
  C     we can scan an input string. STRING is the expression
  C     supplied by the caller; scanning starts at its first
  C     character.

        START = 1

        CALL SCAN ( STRING, MARKS,  MRKLEN, PNTERS, ROOM,
       .            START,  NTOKNS, IDENT,  BEG,    END  )


  C     Next eliminate any white space that was returned in the
  C     list of tokens.

        KEPT = 0

        DO I = 1, NTOKNS

           IF ( IDENT(I) .NE. BLANK ) THEN

              KEPT        = KEPT + 1
              BEG  (KEPT) = BEG(I)
              END  (KEPT) = END(I)
              IDENT(KEPT) = IDENT(I)

           END IF

        END DO

  C     Now all of the substrings remaining point to grouping
  C     symbols, operators, functions, or variables. Given that the
  C     individual "words" of the expression are now in hand, the
  C     meaning of the expression is much easier to determine.
  C
  C     The rest of the routine is left as a non-trivial exercise
  C     for the reader.

Restrictions

     1)  The arrays MARKS, MRKLEN, and PNTERS must be prepared by the
         routine SCANPR prior to supplying them for use by SCAN.

Literature_References

     None.

Author_and_Institution

     J. Diaz del Rio    (ODC Space)
     W.L. Taber         (JPL)

Version

    SPICELIB Version 1.1.0, 26-OCT-2021 (JDR)

        Added IMPLICIT NONE statement.

        Edited the header to comply with NAIF standard.

    SPICELIB Version 1.0.0, 26-JUL-1996 (WLT)

Fri Dec 31 18:36:45 2021