CDI DATAONE AND SCIENCEBASE ACCESS POINT EXPANSION: PYTHON APPLICATIONS PROGRAMMING INTERFACE AND ARCGIS TOOLKIT DEVELOPMENT

CDI SSF Category 2:
                               Computational Tools and Services

                 CDI DataONE and ScienceBase Access Point
                Expansion: Python Applications Programming
                 Interface and ArcGIS Toolkit Development
                          Applicants/Principal Investigator(s):
  Mike Mulligan, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO
                  80225. Ph. (303) 202- 4242 Email mmulligan@usgs.gov
  Tim Mancuso, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver CO
                  80225. Ph. (303) 202- 4238 Email tmancuso@usgs.gov

                                              Abstract:
In the last several years the USGS has either sponsored or partnered with other groups to
develop several enterprise data management support systems. Two of these projects, DataONE
and ScienceBase, provide robust web service endpoints to help scientists capture, catalog,
manage, and share data resources. The USGS has a unique opportunity to grow the list of
potential users to the vast group of ArcGIS analysts within the agency. This project seeks to
build a set of support tools that will allow this GIS community to quickly and easily access and
work with geospatial data held by DataONE and ScienceBase, as well as develop repeatable
workflows around these data. The project will develop new access options for ArcMap users,
an easily installed toolbar to take advantage of these new access options, a set of software
documentation and user training materials, and an Open File Report documenting the effort.

                          Total funding amount requested: $30,500
                               Total in-kind funding: $18,150

                                 Specific Datasets Exposed:
            The datasets exposed will be a function of the projects that use these tools

               Geographic/geologic/ecosystem/habitat/taxonomic/other context:
                      All geographic and mission areas; other contexts
                      (keywords and categories) are software-related

                               Type of Product(s) Generated:
        Open File Report, Python Applications Programming Interface, ArcGIS Toolbar,
        Best Practice Documentation, Software Documentation, User Training Materials

Summary
 Introduction and Background: In the last several years the USGS has either sponsored or
 partnered with other groups to develop several enterprise data management support systems.
 Two of these projects, DataONE and ScienceBase, provide robust web service endpoints to
 help scientists capture, catalog, manage, and share data resources.

 As these efforts gain acceptance among scientists, the USGS has a unique opportunity to
 grow the list of potential users to the vast group of ArcGIS analysts (~2000) within the
 agency. This project seeks to build a set of support tools that will allow this GIS community
 to quickly and easily access and work with geospatial data held by DataONE and
 ScienceBase, as well as develop repeatable workflows around these data.

 The deliverables from this work will include:
     a Python API (Applications Programming Interface) for DataONE, based on existing
         service endpoints available through the project;
     a Python API for ScienceBase, based on the existing ScienceBase REST API;
     a Python-based ArcGIS Toolbar that can be plugged into an ArcGIS/ArcMap version
         10 client and tied to a VisTrails workflow management instance;
     an Open File Report documenting the effort, including API documentation;
     the posting of all source code and README files on the USGS GitHub space;
     on-line training in the installation and use of the toolbar and use of VisTrails.

 CDI SSF Category: Computational Tools and Support (SSF Category 2)

 Project Title: CDI DataONE and ScienceBase Access Point Expansion: Python Applications
 Programming Interface and ArcGIS Toolkit Development

 Contacts:
     Mike Mulligan, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling,
        Denver CO 80225. Ph. (303) 202- 4242 Email mmulligan@usgs.gov
     Tim Mancuso, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling, Denver
        CO 80225. Ph. (303) 202- 4238 Email tmancuso@usgs.gov

 Developer Resources:
     Brad Williams, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling,
        Denver CO 80225. Ph. (303) 202- 4234 Email bradwilliams@usgs.gov
     Bruce Powell, USGS Core Science Analytics and Synthesis, DFC 6th&Kipling,
        Denver CO 80225. Ph. (303) 202- 4089 Email bpowell@usgs.gov
     Travis Lawall, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins, CO
        80526. Ph (970)-226-9341 Email lawallt@usgs.gov
     Sebastien Nicoud, USGS Fort Collins Science Center, 2150 Centre Ave, Fort Collins,
        CO 80526. Ph (970)-226-9145 Email snicoud@usgs.gov

 Collaborating Organizations:
      DataONE
             o Dave Vieglais, Director for Development and Operations
                dave.vieglais@gmail.com
      ScienceBase/CSAS
             o Natalie Latysh, USGS Core Science Analytics and Synthesis, DFC
                6th&Kipling, Denver CO 80225. Ph. (303) 202- 4637 Email
                nlatysh@usgs.gov
        Fort Collins Information Science Branch (Web Apps and GIS/RS)
            o Gail Montgomery, USGS Fort Collins Science Center, 2150 Centre Ave, Fort
                Collins, CO 80526. Ph (970)-226-9253 Email montgomeryg@usgs.gov
            o Colin Talbert, USGS Fort Collins Science Center, 2150 Centre Ave, Fort
                Collins, CO 80526. Ph (970)-226-9425 Email talbertc@usgs.gov
            o Laura Smyrl, USGS Fort Collins Science Center, 2150 Centre Ave, Fort
                Collins, CO 80526. Ph (970)-226-4369 Email lsmyrl@usgs.gov

 Detailed description of geographic/geologic/ecosystem/habitat/taxonomic/other context of the
 project and its importance or value if applicable: This project applies to all ArcGIS efforts,
 regardless of geographic reach.

Scope
 In the last two years the USGS has either sponsored or partnered with other groups to develop
 several enterprise data management support systems. Two of these projects, DataONE and
 ScienceBase, provide robust web service endpoints to help scientists capture, catalog, manage,
 and share data resources.

 As these efforts gain acceptance among scientists, the USGS has a unique opportunity to
 grow the list of potential users to the vast group of ArcGIS analysts within the agency. This
 project seeks to build a set of support tools that will allow this GIS community to quickly and
 easily access and work with geospatial data held by DataONE and ScienceBase, as well as
 develop repeatable workflows around these data.

 Being able to work with data is one part of the analysts’ concerns. Capturing the steps
 involved in an analysis for workflow reconstruction and metadata support is also an essential
 part of the equation. To ensure analysts can work with a ready-to-use tool, this project seeks
 to both develop an applications programming interface compatible with ArcMap and to
 encapsulate that API in a toolbar that can be quickly and easily added to the ArcGIS analyst’s
 client installation. As a part of the effort, the toolbar will allow seamless capture of the
 analyst’s workflow by the VisTrails workflow management tool.

 The deliverables from this work will include:
     a Python API (Applications Programming Interface) for DataONE, based on existing
         service endpoints available through the project;
     a Python API for ScienceBase, based on the existing ScienceBase REST API;
     a Python-based ArcGIS Toolbar that can be plugged into an ArcGIS/ArcMap version
         10 client and tied to a VisTrails workflow management instance;
     an Open File Report documenting the effort, including API documentation;
     the posting of all source code and README files on the USGS GitHub space;
     on-line training in the installation and use of the toolbar and use of VisTrails.

 The staff involved in this project have had considerable success developing the original APIs,
 developing workflows with VisTrails, and building enterprise connections to ArcGIS.
 This is the right group to address this critical issue.

Technical Approach
 The technical deliverables from this project (the Python application programming
 interface, the ArcGIS toolbar, and the VisTrails integration) would be based on existing
 projects. API development would build on the Geo Data Portal API currently available via
 the USGS GitHub space. The ArcMap toolbox/Add-In and a VisTrails package would
 open ScienceBase and DataONE to seamless solutions for data and workflow
 management. VisTrails is an open source project that already has connections to
 ScienceBase and DataONE.

 The Python functions would include, but not be limited to:
    SearchWCS(searchTerm, Repo) to return a list of results;
    getMD(DataID, Repo) to return an FGDC metadata record for a single result;
    updateItemMD(DataID, replacementMD as XML file, Repo) to update a metadata record;
    getWMS(DataID, Repo) to return the URL of the service generated from a single result;
    downloadLocal(DatasetID, outputFName, Repo) to save a local copy of a dataset;
    uploadLocal(localFName, RepoFName, optional Username, optional UserPassword,
      Repo) to upload a local file to the target server.
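
 One way the function list above could be organized is as a single repository-neutral
 interface with one implementation per repository. The following is a minimal sketch under
 stated assumptions: the class names RepositoryClient and InMemoryRepo are hypothetical, the
 method names are Python-style renderings of the listed functions, the Repo argument is
 replaced by choosing a subclass, and getWMS and the optional credentials are omitted for
 brevity. The stand-in backend keeps everything in memory and makes no ScienceBase or
 DataONE calls.

```python
from abc import ABC, abstractmethod


class RepositoryClient(ABC):
    """Hypothetical common interface; each repository (ScienceBase, DataONE)
    would supply its own subclass instead of taking a Repo argument."""

    @abstractmethod
    def search(self, search_term):
        """Return a list of dataset IDs whose title matches search_term."""

    @abstractmethod
    def get_md(self, data_id):
        """Return the metadata record (XML text) for one dataset."""

    @abstractmethod
    def update_item_md(self, data_id, replacement_md):
        """Replace the metadata record for one dataset."""

    @abstractmethod
    def download_local(self, dataset_id, output_fname):
        """Save a local copy of a dataset's content."""

    @abstractmethod
    def upload_local(self, local_fname, repo_fname):
        """Upload a local file and return the new dataset ID."""


class InMemoryRepo(RepositoryClient):
    """Illustrative stand-in backend; performs no network access."""

    def __init__(self, items=None):
        # dataset ID -> {"title": str, "md": str, "content": bytes}
        self._items = dict(items or {})

    def search(self, search_term):
        term = search_term.lower()
        return sorted(data_id for data_id, item in self._items.items()
                      if term in item["title"].lower())

    def get_md(self, data_id):
        return self._items[data_id]["md"]

    def update_item_md(self, data_id, replacement_md):
        self._items[data_id]["md"] = replacement_md

    def download_local(self, dataset_id, output_fname):
        with open(output_fname, "wb") as out:
            out.write(self._items[dataset_id]["content"])

    def upload_local(self, local_fname, repo_fname):
        with open(local_fname, "rb") as src:
            content = src.read()
        data_id = "upload-%d" % (len(self._items) + 1)
        self._items[data_id] = {"title": repo_fname,
                                "md": "<metadata/>",
                                "content": content}
        return data_id
```

 Keeping the interface separate from any one backend would let the toolbar code call the
 same methods whether the data are held by ScienceBase or DataONE.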

These capabilities in other forms are already supported through system service endpoints.
ScienceBase and DataONE both have production published REST services that are being used by
a variety of groups. For the most part, these uses are focused on delivering content to web portals
and desktop modeling/data processing systems. The use of these services by ArcGIS users has
generally been focused on search and display of data resources. This new approach will focus on
data and metadata submission, as well as building derivative products from delivered data.
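
As an illustration of how a Python API function might wrap one of these REST services, the
sketch below builds a search request URL with the Python 3 standard library. The base URL and
the q, format, and max parameter names are assumptions made for illustration and should be
checked against the current ScienceBase documentation; the helper name build_search_url is
hypothetical. No request is sent.

```python
from urllib.parse import urlencode

# Assumed catalog search endpoint; verify against the service documentation.
SCIENCEBASE_ITEMS_URL = "https://www.sciencebase.gov/catalog/items"


def build_search_url(search_term, max_results=20,
                     base_url=SCIENCEBASE_ITEMS_URL):
    """Return a URL requesting JSON search results for search_term."""
    # urlencode escapes spaces and reserved characters in the search term.
    query = urlencode({"q": search_term,
                       "format": "json",
                       "max": max_results})
    return base_url + "?" + query


url = build_search_url("phenology data", max_results=5)
```

Separating URL construction from the network call keeps the function easy to test and lets
the same helper serve a different repository by passing another base_url.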

The API would be delivered to analysts as an ArcGIS Toolbar. This approach has been
successful in other projects, including work by this project group on delivering and modeling
phenology data housed by ScienceBase. Toolbar development would focus on the ArcGIS 10.x
architecture. The source code would be exposed via GitHub, providing a way for future efforts to
build on the initial development work.

The use of VisTrails as a part of the technical stack is based on the increased use of this open
source project for workflow capture and recreation. VisTrails provides an extensible way for
ArcMap/ArcGIS users to build process metadata as part of product construction. This would be a
recommended but optional component in the full package; ScienceBase and DataONE would be
adapted to accept VisTrails inputs for those users who want to use the full technical stack.
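
The kind of process metadata described above can be illustrated without VisTrails itself. The
sketch below is a plain-Python stand-in, not the VisTrails API: a hypothetical record_step
decorator appends each processing call, with its arguments and result, to a log that could
later be stored alongside a product. The two processing functions are toy examples.

```python
import functools

workflow_log = []  # ordered record of executed steps


def record_step(func):
    """Log the name, arguments, and result of each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        workflow_log.append({"step": func.__name__,
                             "args": args,
                             "kwargs": kwargs,
                             "result": result})
        return result
    return wrapper


@record_step
def clip_extent(extent, margin):
    """Toy step: shrink an (xmin, ymin, xmax, ymax) extent by margin."""
    xmin, ymin, xmax, ymax = extent
    return (xmin + margin, ymin + margin, xmax - margin, ymax - margin)


@record_step
def extent_area(extent):
    """Toy step: area of an (xmin, ymin, xmax, ymax) extent."""
    xmin, ymin, xmax, ymax = extent
    return (xmax - xmin) * (ymax - ymin)


# Steps are logged in execution order: clip_extent first, then extent_area.
area = extent_area(clip_extent((0, 0, 10, 10), margin=1))
```

Because the log is captured as a side effect of running the analysis, the analyst does not
have to document each step by hand, which is the property the proposal seeks from workflow
capture.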

The vast majority of funded work would be centered on the Python API, ArcGIS Toolbar, and
VisTrails integration. As the development team exercises the ScienceBase and DataONE REST
services, we expect that these projects will have to make nominal changes to their APIs. These
changes will be contributed as in-kind work. Also, CSAS and Fort Collins will contribute their
full development environment as in-kind equipment/supplies.

Project Experience
 The principal investigators are well versed in all aspects of this project (API development,
 Python, ArcGIS Toolbar implementation, VisTrails workflow management). CSAS staff
 have been involved in the DataONE project since its inception, are the product owners of the
 ScienceBase project, and are key players in GIS tool development. The proposed CSAS
 contingent of cooperators and contract staff has successfully supported core CSAS data
 systems (e.g., ITIS, GAP, VCP, NFHAP, MARIS, OBIS). Fort Collins Science Center staff
 have provided development support for ScienceBase and a number of GIS tool products, have
 contributed to the VisTrails open source effort, and have developed APIs for a number of
 projects.

Commitment to Effort
 Core Science Systems has invested heavily in the support of ScienceBase and DataONE. It is
 reasonable to expect CSS will continue to support both projects, either as a sponsor or a
 project collaborator. The lessons learned through this exercise will be applicable to similar
 efforts (API development and a reference implementation of the API in a client system).

 This proposal includes contract staff for support development. The COR for each respective
 organization has been contacted and the COR has approved use of the contract vehicle to
 support this effort.

Budget

  Budget Category             Federal Funding “Requested”            Matching Funds “Proposed”

 1. SALARIES (inc. number of hours and hourly rate):
  Federal Personnel        $                                         $
  Colin Talbert, 20 hrs                                              $1,200
  Gail Montgomery, 40 hrs                                            $2,200
  Laura Smyrl, 40 hrs                                                $2,500
  Mike Mulligan 20 hrs                                               $1,000
  Tim Mancuso 25 hrs                                                 $1,250
  Bruce Powell 40 hrs                                                $2,000

  Contract Personnel                                                 $
  Contract Staff (300 hours @ $70/hr)         $21,000
  Contract Staff (150 hours @ $60/hr) CSAS    $7,500

 2. FRINGE BENEFITS: N/A
  Personnel              $                                           $
                         $                                           $
                         $                                           $
                         $                                           $

Contract Personnel          $                                 $
                              $                                 $
                              $                                 $
                              $                                 $
  Total Fringe Benefits:      $                                 $

 3. TRAVEL EXPENSES*: N/A
  Per Diem                $                                    $
  Airfare                 $                                    $
  Lodging Cost            $                                    $
  Vehicle Cost            $                                    $
  Mileage                 $                                    $
  Other travel expense(s) $                                    $

  Total Travel Expenses:      $                                $

 4. OTHER DIRECT COSTS: (itemize)
  Equipment (inc. software, hardware, etc.) –
  Development Environment     $                                 $8,000
  Supplies                    $                                 $
  Training                    $                                 $
  Publications                $                                 $
  Office supplies             $                                 $
  Communications Cost (OFR Publication
  through SPN)                $2,000                            $
  Total Other Direct Costs    $2,000                            $

  Total Direct Costs:         $ 30,500                          $ 18,150
  Indirect Cost (%)           $                                 $

  GRAND TOTAL:                $30,500                           $18,150

Timeline

 Deliverable                                                Estimated Delivery Date
 Development of Python API for ScienceBase                  8 weeks from time of award
 Addition of DataONE Member Node to Python API              12 weeks from time of award
 Development of ArcGIS toolbar that uses Python API         16 weeks from time of award
 Development documentation and user manual                  20 weeks from time of award
 Establish Training Schedule for Toolbar installation and   20 weeks from time of award
 use; test training with initial sets of users
 Open-File Report                                           24 weeks from time of award

Appendices

No attachments or appendices
