A PROTOTYPE FOR ADAPTIVE ASSOCIATION OF STREET NAMES WITH STREETS ON MAPS

        G. Nagy (1), A. Samal (2), S. Seth (2), T. Fisher (2), E. Guthmann (2),
     K. Kalafala (1), L. Li (2), P. Sarkar (1), S. Sivasubramaniam (1), Y. Xu (1)
              (1) Rensselaer Polytechnic Institute, Troy, NY 12180, USA
              (2) University of Nebraska, Lincoln, NE 68502, USA
                           email: nagy@ecse.rpi.edu

      We present work in progress on the development of a partially-automated system for inter-
      pretation of map images. The principal aim of our project is to demonstrate that an adap-
      tive system can decrease operator intervention with increasing conversion volume. The
      extracted information is evaluated against an established database, and a cost model is con-
      structed to evaluate the entire conversion task.

1 Overview
    We report the status and results from a recently initiated project on interpretation of map
images. Our immediate goal is to identify street lines and street names in a scanned USGS to-
pographic quadrangle and produce a database of associations between them.
    The project involves several challenging tasks that are beyond the current state of the art
in automated map conversion. We plan to automate gradually the many functions required, be-
ginning with the most time-consuming aspects of manual digitization. Small sections of the
map will be converted first, and an operator will correct any errors. These operator actions will
provide feedback to automatically alter the parameters of the processing algorithms.
    The logged operator interventions will also assist us in the preparation of a cost model. Our
emphasis throughout is on techniques that facilitate the batch conversion of maps of the same
type. While the conversion of the first map in the batch may require significant operator in-
tervention, subsequent maps will benefit from adaptive techniques based on the graphic and
typesetting consistency within the batch.
    We build on considerable previous work. Vectorization under document-specific constraints
and “beautification” are discussed in [1, 2, 3, 4], street-line and street-name extraction in [5,
6], and the separation of text and line art in [7, 8]. Excellent overviews of model-based map
interpretation can be found in two recent dissertations [9, 10]. We also believe that over the last
two decades much cartographic and image processing expertise has been built into our principal
software tool, ARC/INFO [11].
    The data flow through the conversion process is illustrated in Figure 1. The registration of
the map to geodetic coordinates is required only for evaluation against the reference data. Color
separation is currently performed using the hue component of the HSV (Hue, Saturation, and
Value) model.
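
    As an illustration of hue-based separation, the following Python sketch assigns a single
RGB pixel to a coarse layer. The hue band for red and the value and saturation thresholds are
hypothetical placeholders, not the settings used in this project; the tests on value and
saturation are an added assumption to keep black ink and paper out of the hue test.

```python
import colorsys


def classify_pixel(r, g, b, black_value=0.35, min_saturation=0.25):
    """Assign an 8-bit RGB pixel to a coarse map layer using the HSV model.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if v < black_value:          # dark ink, whatever its hue
        return "black"
    if s < min_saturation:       # unsaturated and bright: paper background
        return "background"
    hue_deg = h * 360.0
    if hue_deg < 30.0 or hue_deg >= 330.0:   # hypothetical band for red ink
        return "red"
    return "other"               # water, vegetation, and other colored layers


print(classify_pixel(200, 30, 30))
```

Hue is undefined for gray pixels, which is why this sketch tests value and saturation before
looking at the hue band.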
    The separation of the black sublayer into street line, street label, and “other” is a difficult
task. Contextual cues must be used to differentiate small building icons, traffic islands, and
individual characters. Current commercial OCR (Optical Character Recognition) systems cannot
recognize street names with sufficient accuracy. Street names are therefore recognized using
domain-specific character prototypes obtained from a few operator-labeled words [12, 13].

                                  Figure 1: Schematic of data flow
[Diagram not reproduced. Box labels: Washington DC East Quad; Scanning; Map Image; Registration;
Color Separation; Black Layer; Red Layer; Other Layers (Water, Vegetation, Built-up Areas,
Elevations), marked Future Work; Sublayer Separation; Text; Line; Graphics; Text Processing;
Vectorization; Line Processing; Street Names; Other Text; Street Lines; Other Lines; Street Index;
Street Name Association; Expert System; Post-processing & Verification; USGS DLG Files;
TIGER/Line Files; Evaluation; Residual Errors; Conversion Cost; Output DLG Files.]

     The initial vectorization is performed by ArcScan. The resulting line segment configura-
tion is evaluated according to map-specific constraints, and corrections are made on the basis
of detailed analysis of the nearby foreground pixels, line segment connectivity and orientation,
and intersection topology.
     The street-line/street-name association is accomplished by determining the line segment
nearest to and best aligned with every street-name bounding box, then tracing the chains of par-
allel street polylines to their logical termination.
     Finally, the segment of map that has already been processed is submitted to the operator.
Errors are corrected using a GUI (Graphical User Interface) to ARC/INFO. The system keeps a
detailed log and associates every correction with the responsible algorithm.
     Every one of the processing algorithms described above has several parameters that are ini-
tialized to default values according to experience from previous maps. The log of operator cor-
rections provides the opportunity to change these values. Each change is checked against the al-
ready existing “correct” database. Some of the steps, such as vectorization and line processing,
also have internal (automatic) consistency checks that are used to provide feedback to previous
stages.
     The verified database of street name associations is compared to TIGER (Topologically In-
tegrated Geographic Encoding and Referencing) and DLG (Digital Line Graph) files for eval-
uation. The cost-benefit ratio of residual error to operator time and computing resources forms
the basis of a model for predicting the cost of new conversion tasks.

2 Image Acquisition and Processing
    The map image was provided to us on a CD-ROM by the US Government’s National Imagery
and Mapping Agency (NIMA). The high-quality image was obtained by scanning a 7.5-minute
USGS topographic litho of Washington DC East at 1000 dpi and 24 bits/pixel. The uncompressed
image occupies over one gigabyte of storage. After compression, this reduces to about
70 megabytes, still a formidable size for storage and processing. Under these circumstances,
subsampling to, say, 250 dpi is a tempting preprocessing alternative.
    Figure 2(a) shows a section of the map; identical pieces from this section are shown at
1000 and 250 dpi in Figures 2(b) through 2(e). At the lower resolution, character shapes may
be distorted (Figure 2(e)), extra gaps may be introduced in a street line where it is
particularly thin (Figure 2(c)), and overlapping glyphs may be hard to distinguish from
parallel street lines (Figures 2(b) and 2(d)). Intersections can be identified reliably only
if fine detail is preserved where street lines intersect; the example in Figure 2(d) shows that
this is not the case for the 250 dpi image. The regularity of halftone patterns is also
significantly distorted at the lower resolution.

                      Figure 2: 1000dpi and 250dpi images compared.

     In our view, the above disadvantages of image subsampling outweigh the potential problems
of working with the high-resolution image. A suite of tools, based on the public-domain
TIFF library [14], was developed to ease the cumbersome task of accessing sections of
large image files. The image is stored in a tiled format as a mosaic of square tiles. The tile
size was kept small (about 1/2 inch, or 512 pixels, square) so that each tile image could be
viewed at full resolution on computer displays, without distortion or need to scroll. Library
routines allow efficient extraction of any rectangular region for display or processing.
Additional routines allow examination of arbitrary pixels within a tile in the black layer.
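
     The region-extraction idea can be sketched as follows: given a pixel rectangle, compute
which 512-pixel tiles must be read. The function name and the (column, row) indexing are
assumptions made for this illustration; they are not the interface of our tool suite or of the
TIFF library.

```python
TILE = 512  # tile edge in pixels, as stated in the text


def tiles_for_region(x0, y0, width, height, tile=TILE):
    """Return (col, row) indices of every tile overlapping a pixel rectangle.

    (x0, y0) is the top-left pixel of the rectangle; indices start at 0.
    """
    if width <= 0 or height <= 0:
        return []
    first_col, last_col = x0 // tile, (x0 + width - 1) // tile
    first_row, last_row = y0 // tile, (y0 + height - 1) // tile
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# A 100 x 10 pixel strip starting at x = 500 straddles two tiles.
print(tiles_for_region(500, 0, 100, 10))
```

Only the listed tiles need to be decompressed, which is what makes access to small sections of
a very large image practical.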
     The relative ease with which arbitrary map sections can be accessed by the tools suite meant
that, for each major image processing operation, we had the choice of doing it at the grid-cell
vs the whole-image level. In some cases, e.g. connected-component analysis, it is cumbersome
to integrate the data from cell-level analysis to the whole image. In other cases, particularly
vectorization and text recognition, cell-level analysis is a viable option.

3 Sublayer Separation
     Our strategy for sublayer separation is an initial classification based on connected-component
(CC) analysis, and subsequent improvement by contextual analysis. The initial classification
aims to distinguish between CCs for Lines, isolated character blocks (Text), and a catch-all class
denoting everything else as Icons. Overlapping objects of different classes would obviously be
misclassified in this CC-based labeling. The contextual analysis is expected to catch the bulk
of errors of this type, and refine the Lines class further into streets, grid-lines, railroads, etc. At
the time of this writing, we have completed only the initial classification of the black layer.
     Using standard techniques, our CC algorithm found the four-connected components in two
complete scans of the compressed black-layer image (about 20 Megabytes) and labeled each
foreground pixel with its CC number. The total CPU time on a Sparc-20 server was 13.5 min-
utes.
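
     For readers unfamiliar with the standard technique, the following Python sketch shows a
two-pass, four-connected labeling with a union-find table of label equivalences. It operates
on a small in-memory list of rows and is an illustration only; it is not the code that was run
on the compressed black-layer image.

```python
def label_components(image):
    """Two-pass 4-connected labeling of a binary image (list of rows of 0/1).

    Returns (labels, count), where labels has the same shape as image and
    foreground pixels carry component numbers 1..count.
    """
    parent = [0]  # union-find forest; index 0 is reserved for background

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    rows, cols = len(image), len(image[0]) if image else 0
    labels = [[0] * cols for _ in range(rows)]

    # Pass 1: assign provisional labels and record equivalences.
    for y in range(rows):
        for x in range(cols):
            if not image[y][x]:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up and left:
                root_up, root_left = find(up), find(left)
                root = min(root_up, root_left)
                parent[max(root_up, root_left)] = root
                labels[y][x] = root
            elif up or left:
                labels[y][x] = up or left
            else:
                parent.append(len(parent))  # new provisional label
                labels[y][x] = len(parent) - 1

    # Pass 2: replace each label by its root, renumbered consecutively.
    final = {}
    for y in range(rows):
        for x in range(cols):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels, len(final)


u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
print(label_components(u_shape)[1])
```

The U-shaped example receives two provisional labels in the first pass, which are merged when
the bottom row joins them.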
     The decision rule for the initial classification was derived from the ground-truth data for a
3" x 3" section of the map. It is based on two attributes of the rectangular (horizontal) bounding
box (BB) of each CC. The parameters in the decision rule were chosen to minimize the
classification error in the ground truth.
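
     The sketch below illustrates how parameters of such a rule can be chosen to minimize the
error on labeled samples. The two attributes used here (the longer BB side and the BB aspect
ratio), the form of the rule, and the sample data are all hypothetical; the text above does not
specify them, and this is not the rule that produced our results.

```python
from itertools import product


def classify(width, height, line_side, text_aspect):
    """Hypothetical two-attribute rule on a bounding box (sizes in pixels)."""
    longer, shorter = max(width, height), min(width, height)
    if longer >= line_side:                 # very long boxes: Lines
        return "Lines"
    if longer / shorter <= text_aspect:     # compact boxes: Text
        return "Text"
    return "Icons"                          # everything else


def fit(samples, side_grid, aspect_grid):
    """Exhaustive search for the thresholds with the fewest labeling errors.

    samples is a list of (width, height, true_label) tuples.
    Returns (errors, line_side, text_aspect) for the first best setting.
    """
    best = None
    for line_side, text_aspect in product(side_grid, aspect_grid):
        errors = sum(classify(w, h, line_side, text_aspect) != label
                     for w, h, label in samples)
        if best is None or errors < best[0]:
            best = (errors, line_side, text_aspect)
    return best


# Invented ground truth, for illustration only.
ground_truth = [(30, 40, "Text"), (35, 30, "Text"),
                (400, 12, "Lines"), (15, 600, "Lines"),
                (90, 20, "Icons"), (25, 80, "Icons")]
print(fit(ground_truth, [50, 100, 200], [1.5, 2.0, 3.0]))
```

An exhaustive grid search is practical here because the rule has only two parameters.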

Figure 3: Results of classification on a small section of the map. Counterclockwise from
the top-left, the four image panels show, respectively, the black layer, Lines, Icons and
Text.

    The four panels in Figure 3 show the classification graphically for a part of the 3" x 3" tile.
In the class identified as Text, there are both erroneous lines (small polygonal shapes at
odd-shaped intersections, quite common in DC!) and erroneous icons. We expect many of the errors
of omission and commission to be corrected by contextual analysis. Most of the touching glyphs in
the line layer can be identified as missing pieces of the strings in the Text class. ArcScan’s
ability to bypass overlapping glyphs during vectorization of street lines can also be used to
advantage. The parameters used in the classification rule and contextual analysis are subject to
adaptation.
     A summary of the errors for the 3" x 3" tile appears in Table 1. The 24 BBs that were smaller
than a minimum pixel size (10x10) were considered to be noise and not classified.

                      Table 1: Initial Sublayer Classification
                                        Assigned Label
                      Object Class      Text      Lines     Icons     Total (count)
                      Text              89.7%      5.6%      4.7%     234
                      Lines              0.0%     97.4%      2.6%     115
                      Icons              1.6%      8.9%     89.5%     124
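
     An overall figure can be derived from Table 1 by weighting each row's diagonal entry by
its object count. The short calculation below does this; the resulting overall rate is a
derived value under the assumption that the percentages are exact, not a figure reported
elsewhere in this paper.

```python
# Rows of Table 1: percentage assigned to each label, and object count.
table = {
    "Text":  ({"Text": 89.7, "Lines": 5.6, "Icons": 4.7}, 234),
    "Lines": ({"Text": 0.0, "Lines": 97.4, "Icons": 2.6}, 115),
    "Icons": ({"Text": 1.6, "Lines": 8.9, "Icons": 89.5}, 124),
}

total_objects = sum(count for _, count in table.values())
# Diagonal entries are the correctly labeled fractions of each class.
correct = sum(row[cls] / 100.0 * count for cls, (row, count) in table.items())
overall = correct / total_objects

print(total_objects, round(100.0 * overall, 1))
```

With the rounded percentages in the table, this gives 473 classified objects and an overall
correct-classification rate of roughly 91.5%.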

4 Vectorization
    An example of the quality of the vectorization performed by ArcScan on the tile of Figure 3
is shown in Figure 4(left). The result is better than that of the several vectorization programs
that we have developed. The objective of line processing is to improve it further, both to locate
the streets accurately with respect to the DLG and to facilitate street-name association, as
shown in Figure 4(right).

          Figure 4: Left: ArcScan vectorization; Right: Improved vectorization.

    The first step is to run several checks on the line segments, including histograms of line-
lengths and node-degrees. On the street layer, good vectorization is characterized by few very
short line segments and few nodes of degree other than 2. Next, the foreground pixels near
the line segments are checked for line thickness, and each line segment whose median line-
thickness is different from that expected is flagged. Line segments are mated to parallel line seg-
ments within the appropriate range of street-width spacing and unmated segments are flagged.
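
    Two of these checks, the fraction of very short segments and the fraction of nodes whose
degree differs from 2, can be sketched in Python as below. The segment representation, the
length threshold, and the function name are assumptions made for illustration; the thickness
and mating checks are not covered.

```python
import math
from collections import Counter


def quality_measures(segments, short_length=5.0):
    """Simple vectorization-quality indicators.

    segments is a list of ((x1, y1), (x2, y2)) endpoint pairs; endpoints that
    coincide exactly are treated as the same node. short_length is a
    hypothetical threshold in pixels.
    """
    if not segments:
        return {"short_fraction": 0.0, "non_degree_2_fraction": 0.0}
    degree = Counter()
    short = 0
    for a, b in segments:
        degree[a] += 1
        degree[b] += 1
        if math.dist(a, b) < short_length:
            short += 1
    non_degree_2 = sum(1 for d in degree.values() if d != 2)
    return {
        "short_fraction": short / len(segments),
        "non_degree_2_fraction": non_degree_2 / len(degree),
    }


square = [((0, 0), (10, 0)), ((10, 0), (10, 10)),
          ((10, 10), (0, 10)), ((0, 10), (0, 0))]
spur = square + [((10, 0), (12, 0))]   # a short dangling stub
print(quality_measures(square), quality_measures(spur))
```

The clean square scores zero on both measures, while the added stub raises both, which is the
kind of contrast the checks are meant to expose.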
    Our measures of vectorization quality easily distinguish between the two vectorizations in
Figure 4. ArcScan requires setting 17 parameters for its vectorization routine. In the case of
suspect vectorization, as indicated by the quality measures, these parameters will be readjusted
according to rules stored in an expert system (EXSYS) [15].
    Polyline pairs of continuing streets (a chain of mated line segments) are denoted as curb
pairs. The two lines in a curb pair are assigned opposite orientations as follows: imagine the
two lines to represent opposite lanes of traffic in a two-lane street and assign each the direction
according to the US traffic flow convention.
    The above steps allow robust separation of continuing street lines from other mistakenly
vectorized objects, but are not sufficient to correct errors in the representation of street
intersections. This is important, because the street-name association depends on accurate
representation of the connectivity of the street network. We define an intersection as the area
bounded by the smallest closed curve with only curb pairs incident upon it (Figure 5(left)).
Note that a minor change in the line configuration may be sufficient to change an intersection
into two separate intersections (Figure 5(right)).

            Figure 5: Left: An intersection; Right: Two separate intersections.

    A canonical intersection is one where curb pairs are not connected within the intersection
but adjacent line segments from two curb pairs are connected within the intersection and have
opposite orientations (one directed toward and the other away from the intersection). This rule
requires some modification for underpasses and overpasses (where a street appears to be closed
by a crossing), and for cloverleaf highway intersections. The purpose of this analysis is to flag
intersections that are incorrectly vectorized and to correct them by reference to the underlying
foreground pixels.

5 Street-Name Association
    The street-name association is carried out by (1) determining the street-line closest to each
street name, (2) finding the curb pair that contains this street line, and (3) tracing the chain of
curb pairs in both directions as far as possible. The map characteristics underlying street-name
association are the following:

     • The baseline of a street name is adjacent to and roughly parallel with the associated curb
       pair.

     • Complete street names consist of one or two specific labels (VERMONT or 32ND) that
       precede a single generic label (AVE. or ST.), and may be repeated along the street.

     • Every street name is associated with a chain of curb pairs, but not every chain is associated
       with a street name. The USGS map does not show the names of “unimportant” streets.

     • A chain of curb pairs that does not undergo abrupt direction changes retains the same
       name for its entire length unless curb pairs on it are associated with different labels. Name
       changes for continuing streets take place only at major intersections.

     • The streets and highways form a network; there are no unreachable streets.

     • Optionally, a gazetteer of street names with approximate grid locations is available to
       verify the assignments.
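
    Step (1), choosing the street line nearest to and best aligned with a street name, can be
sketched as a weighted cost over candidate segments. The cost function, the weight, and the
representation of a label by a center point and a baseline angle are assumptions made for this
illustration; finding the curb pair and tracing the chain (steps (2) and (3)) are not shown.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def angle_difference(theta1, theta2):
    """Smallest angle between two undirected directions, in radians."""
    d = abs(theta1 - theta2) % math.pi
    return min(d, math.pi - d)


def best_segment(label_center, label_angle, segments, angle_weight=50.0):
    """Index of the segment minimizing distance + angle_weight * misalignment.

    angle_weight (pixels per radian) is a hypothetical tuning parameter.
    """
    def cost(segment):
        a, b = segment
        segment_angle = math.atan2(b[1] - a[1], b[0] - a[0])
        return (point_segment_distance(label_center, a, b)
                + angle_weight * angle_difference(label_angle, segment_angle))
    return min(range(len(segments)), key=lambda i: cost(segments[i]))


streets = [((0, 0), (100, 0)), ((50, -20), (50, 80))]
# A horizontal label near the crossing is matched to the horizontal street,
# even though the vertical street is closer.
print(best_segment((52, 10), 0.0, streets))
```

A weight of this kind is one example of a parameter that could be adjusted from the log of
operator corrections.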
    Several steps of the street association algorithm require acceptance or rejection of
alternatives, or a choice between several candidates. For instance, at a Y-junction, the
algorithm must make a choice between two branches, and tracing a street across a complex
intersection like a traffic circle is even more difficult. The above assumptions are therefore
codified into a parametrized EXSYS rule base. After the initial default assignment, the
street-association parameters are modified, like other processing parameters, by feedback from
the log of operator corrections.

6 Post-processing, Evaluation and Cost Model
     The interface for interactive correction and verification consists of two separate parts: a
system-controlled iterative session and an operator-controlled editing session. The
system-controlled session is based on the streets recognized by the automatic map conversion
process. A single street at a time is presented to the operator along with its associated name.
The operator is then asked either to accept the data as shown or to make corrections. During the
operator-controlled session, the operator will be allowed to select any existing street line and
perform corrections. There is also an option to add lines that the automatic conversion may have
missed.
     Since one of the paramount objectives in this project is to demonstrate a decrease in the
amount of operator interaction needed by providing automatic conversion of the map image,
the system produces an operator activity log. This log contains the elapsed times for various
operator activities and allows subsequent analysis of the actual amount of operator time required
for conversion.
     The evaluation phase compares the output of our conversion process against the “ground
truth” of the map area. We use existing TIGER and DLG databases of the map area as good
approximations of this ground truth. The DLG dataset for our sample quadrangle was produced
by the USGS by manual digitization of a newer version of the same map that we have, with a
positional accuracy standard requiring that 90% of points be within 0.02 inches (40 feet on the
ground) of their position on the original map. This street line data has a high degree of
correspondence with our scanned map image (Figure 6(left)).
     The TIGER database, on the other hand, does not share the same origins as our map, and
maintains a much lower standard of positional accuracy (Figure 6(right)). This database does,
however, contain street names associated with the street vectors. We will use this to evaluate
the street name recognition and association.
     We will generate a variety of statistical measures to verify the completeness and accuracy of
our output. The bulk of this processing will be done using ARC/INFO. We can calculate several
measures relating to the street lines without manual intervention:

     • Total length of the street network.

     • Percentage of the street network within a specified distance of streets in the DLG data.

     • Percentage of the DLG street network within a specified distance of streets in our output.

     • Average distance from street intersections to the closest intersection in the DLG data.
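
    The second and third measures can be approximated by sampling points along one network
and testing their distance to the other. The Python sketch below is an illustration under
assumed data structures (lists of straight segments in a common coordinate system); it is not
the ARC/INFO procedure, and the sampling step is a hypothetical parameter.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def coverage_percentage(segments, reference, tolerance, step=1.0):
    """Percentage of the length of `segments` within `tolerance` of `reference`.

    Each segment is cut into pieces of at most `step` units, and each piece is
    judged by the distance from its midpoint to the nearest reference segment.
    """
    if not reference:
        return 0.0
    covered = total = 0.0
    for a, b in segments:
        length = math.dist(a, b)
        pieces = max(1, math.ceil(length / step))
        piece_length = length / pieces
        for i in range(pieces):
            t = (i + 0.5) / pieces
            mid = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            nearest = min(point_segment_distance(mid, c, d) for c, d in reference)
            total += piece_length
            if nearest <= tolerance:
                covered += piece_length
    return 100.0 * covered / total if total else 0.0


ours = [((0, 0), (100, 0))]
dlg = [((0, 1), (50, 1))]
print(coverage_percentage(ours, dlg, tolerance=2.0))
```

Swapping the two arguments gives the complementary measure, the share of the reference network
that lies near our output.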

    It is not so straightforward to evaluate the results of street-name recognition and
association. Some information can be gained by direct comparison of the name output against TIGER
data, but a better categorization of errors will require human interaction. For the following
measures, we will use a random sample of street segments:

     • Percentage of streets correctly associated with names.

     • Percentage of streets correctly associated with names that are only partially recognized,
       and the degree to which the name is recognized.

     • Percentage of named streets incorrectly associated or not associated with any name.

                             Figure 6: Left: DLG; Right: TIGER

    The final summary of the evaluation phase will be a cost-benefit model. We will carry out
evaluations after various amounts of operator correction of the data and various amounts of con-
version volume. At one end of the scale is the effort required to manually enter the entire map,
while the other end is fully automated conversion. We will model the relationship between op-
erator intervention time and residual error, and investigate to what degree the various steps of
the process benefit from increased operator intervention.

Acknowledgment
    We gratefully acknowledge the support of the Intelligent Map Understanding Project of the
National Imagery and Mapping Agency. This work is also being supported by the University of
Nebraska-Lincoln, Center for Communication and Information Science. Part of this work was
carried out in the New York State Center for Advanced Technology (CAT) in Automation, Robotics
and Manufacturing at Rensselaer Polytechnic Institute. The CAT is partially funded by a block
grant from the New York State Science and Technology Foundation. We thank Environmental
Systems Research Institute (ESRI) for software support.

References
 [1] T. Pavlidis, C. J. Van Wyk, An Automatic Beautifier for Drawings and Illustrations, Pro-
     ceedings ACM SIGGRAPH 1985, 225-234 (1985).

 [2] O. Hori, A. Okazaki, High Quality Vectorization Based on a Generic Object Model,
     Structured Document Image Analysis, Springer-Verlag, pp. 325-329 (1992).

 [3] M. Roosli, G. Monagan, Adding Geometric Constraints to the Vectorization of Line Draw-
     ings, Graphics Recognition, Springer, pp. 49-56 (1996).
 [4] T. Kaneko, Line Structure Extraction From Engineering Drawings, Pattern Recognition
     Vol. 25, pp. 963-973 (1992).

 [5] G. Myers, P. Mulgaonkar, C. Chen, J. DeCurtins, E. Chen, Verification-Based Approach
     for Automated Text and Feature Extraction from Raster-Scanned Maps, Graphics Recog-
     nition, Springer, pp. 190-203 (1996).

 [6] D.A. Varley and M. Visvalingam, Road Extraction and Topographic Data Validation us-
     ing Area Topology, The Computer Journal, Vol. 37 No. 1, pp. 3-15 (1994).

 [7] L.A. Fletcher and R. Kasturi, A Robust Algorithm for Text String Separation from Mixed
     Text/Graphics Images, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol.
     10 No. 6, pp. 910-918 (1988).

 [8] M.-K. Kim, M.-K. Park, O.-S. Kwon, Y.-B. Kwon, Automatic Region Labeling of the
     Layered Map, Graphics Recognition, Springer, pp. 179-189 (1996).

 [9] R.D.T. Janssen, The application of model-based image processing to the interpretation of
     maps, Doctoral Dissertation, Technical University of Delft (1995).

[10] J. Den Hartog, A framework for knowledge-based map interpretation, Doctoral Disser-
     tation, Technical University of Delft (1995).

[11] M. Zeiler, Inside ARC/INFO, Revised Edition, OnWord Press, Santa Fe, NM (1997).

[12] G. Nagy, Y. Xu, Priming the Recognizer, Procs. DAS-96, Malvern, PA, pp. 263-281
     (1996).

[13] G. Nagy, Y. Xu, Automatic Prototype Extraction for OCR, accepted for presentation
     ICDAR-97, Ulm (1997).

[14] S. Leffler, libtiff software distribution Version 3.4, Copyright (c) Sam Leffler and Silicon
     Graphics, Inc. (1996). Online: ftp://ftp.sgi.com/graphics/tiff/tiff-v3.4-tar.gz .

[15] Multilogic Inc. EXSYS, http://www.multilogic.com/ .