Towards High Quality Administrative Data - A Case Study: New Zealand Police

 
                                           Gavin M. Knight
                                          New Zealand Police

__________________________________________________________________
This report was commissioned by Official Statistics Research, through Statistics New
Zealand. The opinions, findings, recommendations and conclusions expressed in this
report are those of the author(s), do not necessarily represent Statistics New Zealand
and should not be reported as those of Statistics New Zealand. The department takes no
responsibility for any omissions or errors in the information contained here.

Citation: Knight, G. (2008). Towards high quality administrative data – A case study: New Zealand Police,
The Official Statistics System, Wellington, Official Statistics Research Series, Vol 3
ISSN 1177-5017
ISBN 978-0-478-31514-1 [Online], available: www.statisphere.govt.nz/official-statistics-research/series/vol-3

Abstract
Much has been written about principles and standards for designing surveys to
ensure that good quality statistical information results. Less has been written about
standards for administrative data. Although it is thought that many of the same
principles may apply, the terminology is often different and contextual differences
exist that may require changes in the form, if not the substance, of design-standards.
For example, a survey questionnaire is usually designed to be
completed by a sampled respondent just once, whereas a form used to capture
data for an operational IT system may be filled out many times a day by the same
person in order to record information required by that person to perform their job.
Efficiency and relevance may therefore have different implications for the design
of such forms. This paper documents a project undertaken as a case-study on
New Zealand Police that sought to identify principles to assist with designing
good quality administrative data. Recommendations are made, based on these
principles.

Keywords: Administrative data, quality, form, New Zealand Police

Official Statistics Research Series, Vol 3, 2008                                             2
ISSN 1177-5017
www.statisphere.govt.nz/official-statistics-research/series/vol-3

Contents
The main body of this paper describes the project, its methodology, and a summary of
results and conclusions.

Background
Methodology
Phase I: Business Processes
   Description of Phase I
   Results of Phase I
Phase II: Design-Principles
   Description of Phase II
   Results of Phase II
Recommendations
   Process and accountabilities
   Design-rules for guardians
   Ensuring compliance
Next Steps
Conclusions
Appendix 1: Summary of focus-group workshops
Appendix 2: Design-rule recommendations
Appendix 3: Integrationworks' report at conclusion of Phase I

Background
This project was undertaken between November 2006 and August 2007, as part of the
Official Statistics Research programme administered by Statistics New Zealand.
Principal partners in this project were New Zealand Police (NZ Police), which was
responsible for leading the research and was the subject of the case-study, and
Statistics New Zealand (Stats NZ), which not only contributed significant funding to the
project, but also provided a 'project ownership' role to which Police were accountable for
reporting and implementing the project as planned.
NZ Police, Stats NZ and other organisations identified later in this paper contributed to
the research itself.
The project aimed to produce a standard, consisting of principles and rules, as
appropriate, that will be applied by NZ Police to ensure capture of quality statistical
information in administrative systems, and which is sufficiently generic that it could
potentially be applied by other agencies.
Note: This paper uses the terms 'standard', 'principles', and 'rules' as interrelated
concepts, often preceded by the word 'design'. When the word 'standard' is used as a
noun in this paper, it should be interpreted as a combination of 'rules' and 'principles'. It
may be thought of as a concept, not necessarily a specific tangible document. Rules, on the
other hand, must be explicitly documented.

Methodology
The project took the form of a case study on NZ Police, with two phases. The first phase
aimed to understand the current business processes where decisions are made about
design of forms and IT applications and then, from this understanding, to identify where in these
processes design-principles should be applied. The second phase aimed to determine
what these design-principles should be.
An experienced project team was formed, whose makeup varied throughout the project,
adapting to the requirements of the current stage of work. However, for continuity, two
members of the team were involved from start to finish, being Gavin Knight from Police
National Headquarters (PNHQ) and Simon Thomson from Stats NZ's Collection and
Classification Standards unit.
The lead for Phase I was contracted out to Integrationworks - a firm specialising in data,
system and application integration. Integrationworks reported to the project team, which
acted as a steering group. This phase involved interviewing Police staff involved in
changing forms, IT applications and business processes. It also reviewed police
documentation about the business processes relevant to these functions.
Phase II was led by Police, who facilitated a number of workshops, as a form of focus-group,
involving practitioners from various government agencies who work with
administrative data. This group, informed by existing literature and the results of Phase I,
discussed practical implications and issues. It considered options for addressing these
through an administrative data design-standard (consisting of principles and rules), and
formed, by consensus, a view of what this standard should be.

Phase I: Business Processes
Description of Phase I
Phase I commenced in November 2006 and concluded in March 2007.
Initially in Phase I, the project team consisted of:
          •       Gavin Knight, National Statistics Manager, PNHQ, NZ Police
          •       Fiona Morris, Performance Officer, PNHQ, NZ Police
          •       Simon Thomson, Statistical Analyst, Collection and Classification
                  Standards, Stats NZ
          •       Bridget Murphy, Justice Subject Matter Project Manager, Social
                  Conditions, Stats NZ
          •       Barb Lash, Statistical Analyst, Social Conditions, Statistics New Zealand
PNHQ project team members, assisted by members of Police's Information and
Communications Technology Service Centre (ICTSC), selected 'Integrationworks' from three
prospective IT consulting firms, to take the lead in Phase I.
Nick Borrell from Integrationworks interviewed eighteen Police Subject Matter Experts
(SMEs), identified by PNHQ members of the project team. These SMEs included a mix
of sworn and non-sworn police staff and represented a variety of roles within Police,
including:
         •        Systems analysts
         •        Business analysts
         •        Project managers
         •        A file centre manager
         •        Various IT managers
         •        Area commanders
         •        An Area tactical response manager
         •        Intelligence section supervisors

Most SMEs interviewed were of middle-management seniority, ranging from Sergeant
to Inspector in rank or rank-equivalent (non-sworn staff).
In selecting SMEs to interview, Police took into account who had been involved in
requesting or managing changes to forms, IT applications and business processes,
either regularly, or in recent projects or initiatives.
In addition to interviewing SMEs, Nick Borrell reviewed existing Police documentation
relating to the business processes involving making changes to forms and IT
applications.
The project team acted as a steering group for Phase I, providing direction and feedback
to Nick Borrell, as the project progressed. It also assisted with answering questions and
removing roadblocks, such as facilitating access to SMEs.

By the end of Phase I, the makeup of the project team had altered slightly, as Stats NZ sought to
supply the appropriate expertise for giving feedback to Nick Borrell and reviewing his
draft report. In particular, Stats NZ replaced Bridget Murphy with Liping Jiang, Subject
Matter Project Manager, Collection and Classification.

Results of Phase I
Nick Borrell submitted his final report on 22 March 2007. This report described the
process undertaken, documented findings, and made a number of recommendations,
primarily relating to the business processes by which forms and IT applications are
designed.
Nick's report is attached as Appendix 3. Key findings included:
         •    No framework exists to standardise and manage data capture,
         •    There are no resources (e.g. guideline manuals) available to staff for
              addressing data capture,
         •    The existing process to manage form-changes does not consider data quality
              or standards,
         •    The process to manage forms can be circumvented by staff - leading to
              unauthorised changes,
         •    Staff do not appear to know how to initiate changes to forms or policing
              procedures, and
         •    ICTSC is perceived as the de-facto owner of all data quality issues, yet
              resolving issues of statistical information quality is not a core function of
              ICTSC.
The report also made a number of recommendations aimed at addressing the key
findings in a way that minimises the barriers to implementing change, by avoiding
significant process reengineering. Instead, the report takes into account existing
business processes and functional groups. It recommends the minimum modification
necessary to existing processes and work-group functions to effect required
improvements.
It is acknowledged that this tactical approach, applied to other organisations, may result in
different business processes. However, there is a tension to be balanced
between achieving an organisation's buy-in to making change and creating a business
process that is common to all organisations. The latter was viewed as unrealistic and
not necessarily desirable. Whereas principles for data quality may be common, it
may be appropriate for different types of organisations to have different business
processes.
Phase I therefore made recommendations about distinct features of an effective process,
rather than simply recommending a specific process. Such features may have greater
relevance to other organisations than the Police-specific processes that are
recommended.
The key recommendations in Integrationworks' report are:
         •        Create a data quality framework which supports guidelines and principles
                  for data quality,
         •        Establish form-change guardianship,
         •        Appoint data quality guardianship to IT applications,
         •        Develop design-standards for data to be captured,
         •        Bring ICTSC processes in line with design-standards, and
         •        Develop policy to support design standards.
These are detailed more fully in Appendix 3. However, in short, they involve
standardisation of processes, creation and enforcement of design-standards, and
allocation of responsibility for application of standards when changes to forms and IT
applications are proposed by the business.
Phase II considered what these design-standards should be.

Phase II: Design-Principles
Description of Phase II
Phase II commenced in April 2007 and concluded in August 2007.
The project team for Phase II consisted of:
          •       Gavin Knight, National Statistics Manager, PNHQ, New Zealand Police
          •       Chris Worsley, Statistics Business Analyst, PNHQ, New Zealand Police
          •       Simon Thomson, Statistical Analyst, Collection and Classification
                  Standards, Statistics New Zealand
          •       Matt Flanagan, Statistical Analyst, Collection and Classification
                  Standards, Statistics New Zealand
          •       Barb Lash, Statistical Analyst, Social Conditions, Statistics New Zealand
          •       Robyn Smits, Manager, Data Management Unit, Ministry of Education
          •       Dr. Karolyn Kerr, Manager Information and Analysis, Central Region
                  Technical Advisory Services (Health)
          •       Jason Gleason, Senior Data Analyst, Justice Sector Information Strategy,
                  Ministry of Justice.
Additionally, Ian Smith, Police's National Applications Manager, and Senior Sergeant
Bernie Geraghty, Police's National Coordinator of Business Analysts, attended one
meeting. Senior Sergeant Geraghty subsequently continued to provide feedback on
notes from workshop meetings.
Informed by the Phase I report, which identified how a design-standard would be used,
project team members from Statistics New Zealand collated documents containing
principles and standards which it was thought might usefully inform the development of
principles and design-rules for administrative data. These documents included:
         •    "Quality Protocols" of the (New Zealand) Official Statistics System, produced
              by Statistics New Zealand.
         •    "A Guide to Good Survey Design" (July 1995), produced by Statistics New
              Zealand, ISBN: 0-477-06492-2
         •    "Best Practice Guidelines for Classifications", used by Statistics New
              Zealand's 'Classifications and Standards' unit
         •    "Official Statistics System Administrative Data Guidelines"
         •    "Draft principles for designing forms, processes and IT applications to ensure
              desired statistics have acceptable quality" (August 2006), a desk-file used by
              the Statistics Unit at PNHQ
It was apparent to the project team that the principles in most of the above documents
had been developed from the context of surveys. In a couple of instances there had
been an attempt to adapt principles developed for surveys to administrative data.
However, gaps remained.
Having collated and reviewed the above documents, principles from them (many of
which appeared in more than one document) were explicitly identified and discussed by
the project team, in terms of their relevance to administrative data.
Project team members were asked to consider both these principles and gaps in the
principles, based on their experience in working with administrative data.
Discussion took place in six workshops held over a five-month period. Notes were
taken at the workshops, particularly concerning conclusions and the associated
rationale. These notes were reviewed by workshop participants between meetings.
There were no a priori assumptions of validity or intrinsic merit of any suggestions
expressed by team members in the workshops. Team members were asked to consider
all information presented, taking into account both the consistency with their own
experiences and the soundness of the rationale behind ideas.
The result, which is documented in Appendix 1, should therefore not be treated as
empirical, but should be treated as expert opinions that have survived peer-review in a
focus-group context.
The project sponsor (Police) acknowledges the willingness of project team members to
participate in such an exercise, where opinions were challenged in an effort to achieve a
robust result. In general, project team members felt that the result was superior to what
could have been produced by any individual, with ideas from one team member
prompting ideas in others.

Results of Phase II
Notes from the discussions are attached as Appendix 1. A summary of key conclusions
is as follows:
Many of the principles that are applicable to design of survey data are applicable to
administrative data as well. However, sometimes a slight change in terminology is
needed, to reflect the different context.

Ensuring good administrative data requires some additional components which are
either not relevant to survey data or manifest differently from it. These include preventing
'work-arounds' by staff wanting to report statistical information without complying with
rules, and the need to relate level of detail in a classification to the level of detail required
by the underlying operational process generating the data.
The Phase I report (Appendix 3) proposed specific processes for developing forms and
IT applications. In Phase II, it was identified that the process of developing IT
applications typically produces increasing design-detail as the analysis and design
phases of IT projects progress. Not all of the factors necessary to determine alignment
with the principles and design-rules may be evident at the outset. The implication is that
the checking of a proposed design may be more iterative than indicated in the process
diagram suggested by the Phase I report.
As identified in Phase I, to effect quality statistical information from administrative data,
staff require resources (in the form of processes, practices and design guidelines)
relevant to their roles. Generic quality principles are not as useful as principles relevant
to the task at hand. The implication, recognised in Phase II, is that we must identify the
tasks and develop resources for each task. Such resources include:
         •    A template and guideline for developing a proposal for a new form or
              modification to a form or IT application. (proposer)
         •    A manual for checking alignment of a proposal with the design-rules.
              (guardian)
         •    Documentation of the business process(es) by which new and modified forms
              and IT applications are made.
         •    A data dictionary for the organisation.
         •    Policy.

Recommendations
Process and accountabilities
    1. Create a documented standard process for processing proposed forms or
       modifications to forms and IT applications.
    2. Create policy that assists and ensures compliance with this process.
    3. Appoint a person or group in the organisation as 'guardian' in the process for
       creating or modifying forms and provide this guardian with resources such as
       training and/or guidelines that incorporate design-principles and rules. The
       guardian's role is to consider proposed new forms or modifications to forms
       against these design-principles and rules, provide feedback and suggested
       changes to the proposer, and to ensure proposals do not proceed until and
       unless they comply with the design-rules.
    4. Appoint a person or group in the organisation as 'guardian' in the process for
       creating or modifying IT applications and provide this guardian with resources
       such as training and/or guidelines that incorporate design-principles and rules.
       The guardian's role is to consider proposed IT application designs against the
       design-principles and rules, provide feedback and suggested changes to the
       proposer, and to ensure proposals do not proceed until and unless they comply
       with the design-rules.
              •     There may be a separate IT application guardian for each IT application,
                    or one group may take this responsibility for many or all IT applications.
                    The guardian may or may not be the same as the guardian for forms.
              •     For NZ Police, the guardian for forms should be the Applications Support
                    group in ICTSC, as this would require the minimum change from existing
                    accountabilities. The guardian or guardians for IT applications should be
                    determined by the Manager ICT. However, to avoid overlapping
                    accountabilities, as a minimum, there should be no more than one
                    guardian for any given IT application.
    5. Appoint a person or group in the organisation to have responsibility for
       developing, maintaining and communicating 'design-rules for administrative data'
       that ensure quality statistical information.
              •     For NZ Police this should be the National Statistics Manager in PNHQ, as
                    this role best aligns with existing competencies.
    6. Have a standard 'change-request' form that accompanies proposals for new
       forms or modifications to forms or IT applications. The change-request form is
       distinct from any business case justifying and seeking approval for the change.
       Rather, its purposes are to (a) prompt the proposer to identify and consult with
       stakeholders, and (b) capture the information required by the guardian to assess
       the proposal against the design-rules and provide feedback or suggest changes
       in order to comply.
    7. Create a data dictionary for the organisation, as a central register of variables,
       containing their labels, definitions, formats, scales, ranges, and instructions to be
       given to people capturing data.
              •     The data dictionary should be made as accessible as possible on-line to
                    all staff in the organisation (e.g. Police).
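
As a rough illustration of Recommendation 7, the sketch below shows one way a data dictionary entry could hold the attributes listed above (label, definition, format, scale, range and capture instructions), together with a simple check of a captured value against its entry. It is a minimal sketch under stated assumptions, not a description of any Police system: the class names, the method names and the two example variables (age_years and priority) are all hypothetical and were invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataDictionaryEntry:
    """One variable in the central register (hypothetical structure)."""
    name: str                    # identifier used on forms and in IT applications
    label: str                   # short human-readable label
    definition: str              # what the variable means
    data_format: str             # e.g. "integer", "text", "date"
    scale: str                   # e.g. "nominal", "ordinal", "ratio"
    allowed_range: Optional[tuple] = None       # (min, max) for numeric variables
    allowed_values: Optional[frozenset] = None  # permitted codes for categorical variables
    instructions: str = ""       # guidance for the person capturing the data


class DataDictionary:
    """Central register of variables, keyed by variable name."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse duplicates so that each variable has exactly one definition.
        if entry.name in self._entries:
            raise ValueError(f"variable already registered: {entry.name}")
        self._entries[entry.name] = entry

    def lookup(self, name):
        # Returns None when the variable is not registered.
        return self._entries.get(name)

    def validate(self, name, value):
        """Return a list of problems; an empty list means the value conforms."""
        entry = self.lookup(name)
        if entry is None:
            return [f"{name}: not in data dictionary"]
        problems = []
        if entry.allowed_range is not None:
            low, high = entry.allowed_range
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not (low <= value <= high):
                problems.append(f"{name}: {value!r} outside range {low}-{high}")
        if entry.allowed_values is not None and value not in entry.allowed_values:
            problems.append(f"{name}: {value!r} is not a permitted value")
        return problems


def build_example_dictionary():
    """Illustrative entries only; these variables are invented for the sketch."""
    dictionary = DataDictionary()
    dictionary.register(DataDictionaryEntry(
        name="age_years", label="Age",
        definition="Age in completed years at the time of recording",
        data_format="integer", scale="ratio", allowed_range=(0, 120),
        instructions="Record completed years; leave blank if unknown.",
    ))
    dictionary.register(DataDictionaryEntry(
        name="priority", label="Priority",
        definition="Priority assigned to the record",
        data_format="text", scale="ordinal",
        allowed_values=frozenset({"low", "medium", "high"}),
        instructions="Choose exactly one value.",
    ))
    return dictionary
```

Keeping the capture instructions alongside the definition means that a form designer and a guardian checking a proposal would both be reading the same wording, which is the purpose of a single central register.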

Design-rules for guardians
    8. A manual should be created to assist guardians in checking proposals against
       design-rules.
              •     The manual should incorporate the recommendations in Appendix 2.

Ensuring compliance
    9. Policy should be created to assist and ensure compliance with these
       recommendations.
    10. IT designers should endeavour to prevent statistical reporting of data that has
        been excluded from the data to which the design-rules have been applied.
              •     For example, for NZ Police this may involve ensuring all statistical
                    reporting occurs via the data warehouse, rather than directly from
                    proprietary operational IT systems. It may also require that 'Business
                    Objects' universes be designed to support the 'design-rules' and not
                    report other data in a form that facilitates statistical reporting.
              •     One mechanism for controlling this is through the governance processes
                    for approving expenditure. For example, the organisation should consider
                    how payments are currently approved to IT system suppliers for
                    developing reports or reporting capability on their systems.
              •     One type of change-request may involve making data from an operational
                    IT system conform with the design-rules, thereby improving its quality and
                    enabling statistical reporting from it.
    11. IT designers should endeavour to prevent creation of or modification to database
        fields that bypass guardians and/or do not comply with the design-rules.
    12. The Assurance group at PNHQ should incorporate auditing of new or newly
        modified forms and IT applications into its audit framework, to check for
        compliance with policy.
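
Parts of the compliance checks in Recommendations 10 to 12 could be supported by simple automated comparisons. The sketch below is a minimal illustration only, assuming that a list of approved variable names is available (for example, from the data dictionary in Recommendation 7) and that the fields present in an operational system can be listed. The function names and field names are hypothetical.

```python
def audit_fields(system_fields, approved_fields):
    """Split a system's fields into those that are approved and those that are not.

    Unapproved fields are candidates for review, since they may have been
    created or modified without passing through a guardian.
    """
    approved = set(approved_fields)
    compliant = sorted(f for f in system_fields if f in approved)
    unapproved = sorted(f for f in system_fields if f not in approved)
    return {"compliant": compliant, "unapproved": unapproved}


def reportable_columns(requested, approved_fields):
    """Allow statistical reporting only on fields covered by the design-rules."""
    blocked = sorted(set(requested) - set(approved_fields))
    if blocked:
        raise PermissionError(
            "not available for statistical reporting: " + ", ".join(blocked)
        )
    return list(requested)
```

An audit of the kind described in Recommendation 12 might run the first function periodically and follow up on anything listed as unapproved, while the second shows how a reporting layer could decline requests for data to which the design-rules have not been applied.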

Next Steps
    13. Statistics NZ should consider the potential for developing a guideline/manual for
        government agencies, incorporating principles recommended by this project.
    14. Police's National Statistics Manager should, in consultation with ICTSC, develop
        the recommended manual for guardians and the template and guidelines for
        proposing new forms or modifications to forms and IT applications.
    15. Police's ICTSC should, in consultation with the Statistics Unit in Organisational
        Performance Group PNHQ, document a standard process for creating and
        modifying forms and IT applications.
    16. Police's Applications Support group should, in consultation with the Statistics Unit
        in Organisational Performance Group PNHQ, develop an initial data dictionary
        and a manual for its use, then integrate maintenance of the data dictionary into
        the above new standard process.
    17. The Manager ICT should allocate guardianship responsibilities to staff, in line
        with the recommendations of this project.
    18. Police's Policy unit should draft, consult, and seek appropriate approval for policy
        that supports these recommendations.

Conclusions
This project, although not empirical, has provided a valuable step forward in
understanding both design-principles for administrative data, and issues that require
addressing in order to introduce such principles into the business.
The project did not examine in any detail what constitutes 'quality' in administrative data.
Rather, using 'quality' principles already identified in the literature, the project looked at
how these manifest in an administrative data context and considered design-principles to
address them.
Many of the principles applicable to design of survey data are also applicable to
administrative data. However, the context of creating survey data differs markedly from
that of creating administrative data. This leads to a need for creation of resources in a
variety of forms, in order to effect good quality statistical data. These resources include:
    •    Standard documented processes,
    •    An organisation-wide data dictionary,
    •    Allocation of defined roles and responsibilities,
    •    Documented guidelines for staff to undertake these responsibilities,
    •    Organisational policy to assist and ensure compliance, and
    •    Audit, to ensure compliance.
Although undertaken as a case study on New Zealand Police, most results - at the
principles and rules level - do not appear to be uniquely applicable to Police, but would
be of more generic applicability to any organisation producing administrative data.
Results are therefore encouraging.

Appendix 1: Notes from workshops in Phase II
1. Aims of this policy
A policy is required that assists and ensures Police staff create and modify forms and IT
applications in a standardised way that incorporates principles which ensure quality
statistical information can be produced.

Determining content is not in scope for the policy/guidelines
This policy affects front-end design; not IT system data architectures or back-end
extraction tools and standards. These are also required to deliver quality statistical
reporting, but are out of scope for this particular policy.
This policy does not dictate what information Police must record; neither does it dictate
what recorded information will be used to generate statistical information. Instead, once
the business has made these decisions, this policy affects aspects of how forms and IT
application front-ends will be designed.
That said, it is acknowledged that, unless centralised control exists over the process by
which decisions are made about the design of forms and IT applications, standards will
be impossible to apply in practice. Therefore, this policy must include relevant aspects of
business process.
As part of this, consideration must be given to whose responsibility it is to determine
what statistical information is required from a particular form or IT application
development. Options considered, along with their strengths and weaknesses were:

       Option                           Strengths                                 Weaknesses
1. Gatekeeper/             •    Centralised role who is             •   Places much higher responsibility on
guardian                        involved in the process at the          guardian than envisaged.
                                appropriate stage.                  •   Guardian is a facilitator, rather than
                                                                        a business owner.
2. Proposer                •    Likely to come from the             •   Involvement in this process is one-
                                business area driving the               off, so is unlikely to have expertise
                                change.                                 in identifying stakeholders and
                                                                        determining information needs.
3. Business owner of       •    Will be a key stakeholder for       •   May not appreciate broader
the functional area             the affected data and likely to         synergies, information management
that creates the data           understand the business                 implications and stakeholder needs.
                                implications in that area.
4. Business owner of       •    Will understand the directly        •   IT groups should have a capability
the IT application              affected system.                        responsibility, rather than business
                                                                        ownership of the data.
5. A centralised non-IT    •    Is centralised and is likely to     •   This function may not be consistent
business information            understand statistical issues           with the core function of the group.
group (E.g.                     and how to identify breadth of
Headquarters                    information needs.
Statistics Unit)

Additionally, wherever the function sits, there may be resourcing implications.

Conclusion: It is not necessary for one group to carry this responsibility alone. Instead, a
standardised 'change-request' form should be created for use at step 2 of the processes
recommended in sections 3.2 and 3.3 of the Integrationworks report (Appendix 3).
Responsibility for completing the change-request form should be with the proposer.
However, the form should guide the proposer to identify stakeholders and ensure they
are consulted. Such stakeholders will be a mix of operational business managers and
centralised expert groups. E.g. District Commanders, National Statistics Manager and
the Manager Applications Support.
On certain key projects, it may be appropriate for one of these stakeholders to take over
responsibility for proposing the change.
Such an approach leaves responsibility for doing the leg work with the proposer, but
ensures input from a centralised group who can apply expertise that the proposer does
not necessarily have.

So what is in-scope? - Designing the capture of what is recorded
Our aim is to ensure the quality of whatever statistical information Police desire to be
produced by:
    •    Requiring that any proposed creation or change of a Police form be considered
         by a 'gate-keeper' or 'guardian' with the responsibility and knowledge to ensure
         the form is designed in such a way that resultant statistical information will be of
         good quality.
              o     The project team notes that what is meant by 'quality' requires definition.
                    However, such a definition may not differ markedly from existing
                    documented definitions regarding survey data, such as are contained in
                    the documentation reviewed.
    •    Requiring that any proposed creation or change of a Police IT application be
         considered by a 'gate-keeper' or 'guardian' with the responsibility and knowledge
         to ensure the IT application will capture information in such a way that any
         statistics Police desire to be produced from this system will have acceptable quality.
    •    Ensuring that these gatekeepers ask the following questions:
              o     "Of all of the information that is proposed to be recorded on the form/in
                    the IT application, what subset of this information do Police require
                    statistics to be derived from?" (the statistics subset), and
              o     "Is this subset sufficient to provide the desired statistical information?" (If
                    not, either the statistics subset needs to be extended, additional
                    information needs to be recorded or, where not possible, it needs to be
                    accepted that not all of the desired statistical information can be produced
                    from this particular form or IT application.)
                             Note: It is anticipated that most data recorded will, by default, be
                              part of the statistics subset unless specifically excluded.
                              Subsequent discussion will therefore speak as if all data is
                              included in the statistics subset. It is noted here, however, that
                              design-principles applied to this data will not necessarily be
                              relevant to data specifically excluded from the statistics subset.
     •    Preventing data that has been excluded from the statistics subset from being
          extracted from Police systems in a way that enables such data to be used for
          calculating statistical information. E.g. A free-text field used for capturing rough
          notes may be excluded from the statistics subset. This field may be extracted to a
          data warehouse along with other data and may be queried, but it may not be
          used to select records for aggregation, such as in a 'Condition' statement in SQL.
          (This stops work-arounds that would jeopardise quality by avoiding proper
          design.)
     •    Providing guardians with design-rules that ensure forms and IT applications
          capture data in such a way that good quality statistical information will result.
Note: This policy alone cannot ensure quality. It will do so as far as the design of forms
and IT applications is concerned; however, to complement this policy, three further
aspects are required:
     •    policy is required to govern staff recording practices,
     •    data storage, transformation and access mechanisms (I.e. the IT back-end) need
          to be designed appropriately, and
     •    correct practice throughout the system needs to be monitored in a performance
          management framework.
These components are addressed through other aspects of Police's Statistics Strategic
Plan.
Furthermore, some important existing Police systems have already been built in a way
that would not comply with this new policy. As a result, statistical information available
from these systems is inferior in scope and quality to what is desired. Projects to
retrospectively redesign aspects of these systems would be required if these limitations
are to be addressed.
This new policy will, however, prevent a repeat of bad design practices on these legacy
systems.

2. Design Rules (drafted as if for guardian)
Full vs partial modification
When only a part of a form or IT system is being modified, a decision is required on
whether to limit the application of these design-principles to just that part or whether to
take the opportunity of reviewing the whole form, IT application, or relevant module of
the IT application.
Whereas any change to a form or IT application provides an opportunity to address a
number of issues at the same time, a comprehensive change may require extensive
investment that may not be warranted by the desire for a minor improvement in a form.

Four distinct issues need to be considered:
    •    implications for the form/screen layout,
    •    implications for other systems that might use the same variable(s),
    •    implications for time-series, and
    •    operational implications.

Implications for the form/screen layout
This is the simplest impact to assess. The main requirement is that any changes must be
consistent with the rest of the form and avoid confusion. E.g. another part of the form
may refer to a field in the affected portion.

Implications for other systems
This may be more difficult to assess, unless some mechanism to link and/or match fields
in different systems exists.
If affected variables are contained in the organisation's data dictionary and if this data
dictionary identifies all forms and systems using the same variable, this will make the
task easier.
Conclusion: A data dictionary should be created for the organisation. (Refer section 3.3
for details).

Implications for time-series
Altering variables or classifications can interrupt time-series. Examples include
mandating entry of what was previously an optional field, introducing a new category,
obsoleting an existing category, or changing whether or not 'don't know' or 'missing
data' is permitted.
Similarly, even altering how information is prompted can affect what is entered.
Literature on design of survey measuring instruments makes this clear. In the absence of
evidence to the contrary, we should not assume this principle differs for administrative
data.
Changes should therefore only be made if important and, wherever possible, where the
impact on time-series has been analysed.
Metadata accompanying resultant statistical information should note changes that
potentially affect time-series.

Operational implications
Will the change being proposed for the form/IT application work for the business,
including all affected stakeholders? For example, if a new field/variable is proposed, is it
operationally appropriate to collect this information at the stage of the business process
that the data is being recorded, or if a new category is being proposed, does that
category have operational relevance?

Conclusion
The decision about whether to make a partial or full modification when a change is
proposed should be made on a case-by-case basis. The policy should recognise that
judgment needs to be applied. However, it should always require consideration of the
above four types of implications and, where appropriate, provide documentation (E.g.
metadata, training materials, etc.).

Statistical units vs attributes
In short, both may be variables, appearing as fields in forms and data entry screens.
However, statistical units are what is counted and attributes are the descriptors of what
is counted, and can be used to select what to count. (E.g. in 'Condition' statements of
SQL queries.)

Statistical units (or 'measures')
Considering the information to be collected on a form or in an IT system record, identify
the desired statistical units. I.e. what is to be counted.
Statistical units can be considered as being one of two types: 'Direct' and 'Derived':
Direct: These will typically be the 'record' (E.g. a form) that corresponds to the action or
        object to which the form relates. For example, a transaction, occurrence, property
        item, person, etc.
Derived: These do not necessarily have a one-to-one relationship with the core form or
       record being captured in an instance. Derived measures need to be considered
       carefully, firstly to determine feasibility and how to derive desired measures that
       are not one-to-one with records, and secondly to be sure that the measure's
       definition is valid.

         An example of a derived measure would be where we wish to count the number
         of children present at domestic violence incidents. A single record per incident is
         created from the relevant form (for Police, this is the POL400 form). However,
         this form contains fields specifying the number of children present at the incident.
         So it is possible to count the number of children present at domestic violence
         incidents.
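The distinction between direct and derived units can be illustrated with a minimal sketch. The records, the field name `children_present` and the values below are hypothetical and are not taken from the POL400 form; the sketch only shows that the direct unit is counted per record, whereas the derived unit is summed from a field within each record.

```python
# Hypothetical incident records; field names and values are illustrative only.
incidents = [
    {"incident_id": "A1", "children_present": 2},
    {"incident_id": "A2", "children_present": 0},
    {"incident_id": "A3", "children_present": 3},
]


def direct_count(records):
    """Direct statistical unit: one unit per record (here, one per incident)."""
    return len(records)


def derived_count(records, field):
    """Derived statistical unit: summed from a numeric field on each record."""
    return sum(record[field] for record in records)


incident_total = direct_count(incidents)
children_total = derived_count(incidents, "children_present")
```

Note that the derived count is only as valid as the definition of the underlying field; the sketch assumes the field always holds a whole number.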
From a statistical perspective, where it is possible to obtain statistical units both directly
and by derivation, it is preferable to obtain them directly. (E.g. a separate form for each person
present at an incident, rather than a count of such people on the incident report.)
However, this needs to be balanced against respondent burden, firstly through requiring
additional information be recorded that is not required for operational purposes, and
secondly through double-entering information that may already be recorded elsewhere in
the system.

Attributes
Identify the factors that will be used to select and characterise statistical units. These are
known as 'attributes'. Typical attributes may be:

     •    time periods,
     •    geographical areas,
     •    categories of measures, such as job-type, ethnicity, gender, age, cost-centre,
          job-closure category, etc. or
     •    numerical values that describe the measure object, such as age, height, weight,
          etc.
Different attributes may have different formats, such as categorical, numeric, date, etc.
Note: Unique identifiers of individual records (E.g. a file number) are not statistical
attributes, as they only apply to a single record and do not characterise measures.
Once statistical units and attributes have been decided, determine the hierarchical
relationships. For example, one Occurrence may contain one or more Offences, for
which there may be none, one or many Apprehensions of offenders. Similarly, one
District may contain a number of Police Areas which may contain a number of Police
Stations.
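As an illustration only, the Occurrence, Offence and Apprehension hierarchy described above could be represented as nested records, so that each statistical unit is counted at its own level. The class names, field names and identifiers are hypothetical and do not describe any actual Police system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Apprehension:
    apprehension_id: str


@dataclass
class Offence:
    offence_id: str
    # An offence may have none, one or many apprehensions.
    apprehensions: List[Apprehension] = field(default_factory=list)


@dataclass
class Occurrence:
    occurrence_id: str
    # An occurrence contains one or more offences.
    offences: List[Offence] = field(default_factory=list)


def count_units(occurrences):
    """Count each statistical unit at its own level of the hierarchy."""
    offences = [o for occ in occurrences for o in occ.offences]
    apprehensions = [a for o in offences for a in o.apprehensions]
    return {
        "occurrences": len(occurrences),
        "offences": len(offences),
        "apprehensions": len(apprehensions),
    }


example = [
    Occurrence(
        "OCC1",
        [
            Offence("OFF1", [Apprehension("APP1"), Apprehension("APP2")]),
            Offence("OFF2"),
        ],
    )
]
counts = count_units(example)
```

Making the hierarchy explicit in this way helps avoid double counting, for example reporting apprehensions as if they were offences.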

Check fitness-for-purpose
Design the change-request template indicated in sections 3.2 and 3.3 of the Stage 1
report to provide all of the information required by the guardian. For example, it should
work through all fields in the proposed form or IT application input screens, identifying
which fields are to be excluded from the statistics subset. For all other fields, suggested
labels, definitions, variable-type, scale and range, should be included in the template.
The template should also identify stakeholders of the relevant statistical information and
have their sign-off. Such sign-off is to mean that the stakeholder is satisfied that the
statistics subset is sufficient to provide all of the statistical information they expect from
what is collected on that form or IT application input screen.
In deciding how much data to record, balance respondent burden with desired
information. Also, design the structure of any data recorded to maximise the potential for
integrating other data and deriving additional information.
The template must make it clear to the proposer and stakeholders that it will be
impossible to produce statistics using excluded fields.

Design of variables
Check all of the variables in the proposed form or IT application against the
organisation's data dictionary.
This data dictionary is a central register of variables, containing their labels, definitions,
formats, scales, ranges, and instructions to be given to people capturing data.
The data dictionary is not specific to IT systems; rather it defines variables that may
appear in any system. It is important to retain the distinction between variable definitions,
which are about information, and IT systems, which are about technology.
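A register of this kind could be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any existing Police data dictionary: the class names, field names and the example variable are hypothetical, chosen to mirror the items listed above (label, definition, format, scale, range and instructions). The sketch also refuses duplicate labels, on the assumption that each label identifies exactly one variable.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VariableDefinition:
    label: str
    definition: str
    data_format: str                            # e.g. "categorical", "numeric", "date"
    scale: Optional[str]                        # e.g. "years"; None if not applicable
    value_range: Optional[Tuple[float, float]]  # inclusive bounds; None if not applicable
    instructions: str                           # guidance for people capturing the data


class DataDictionary:
    """Central register of variables, independent of any particular IT system."""

    def __init__(self) -> None:
        self._variables: Dict[str, VariableDefinition] = {}

    def register(self, variable: VariableDefinition) -> None:
        # Assumption: a label may be registered only once.
        if variable.label in self._variables:
            raise ValueError(f"Label already in use: {variable.label}")
        self._variables[variable.label] = variable

    def lookup(self, label: str) -> Optional[VariableDefinition]:
        """Return the registered definition, or None if the label is unknown."""
        return self._variables.get(label)


dictionary = DataDictionary()
age = VariableDefinition(
    label="age_years",
    definition="Age in completed years",
    data_format="numeric",
    scale="years",
    value_range=(0, 120),
    instructions="Record age at the time of the event.",
)
dictionary.register(age)
```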

If a proposed variable is the same or similar to an existing variable in the data dictionary,
where such similarity is considered adequate for operational purposes, the existing
variable should be used in preference to creating a new one.
In general, use a structured format in preference to free text, wherever possible.
Wherever the guardian proposes a change from the proposal, this should be worked
through with the proposer and, in turn, stakeholders identified on the template.
Variables on different systems with the same label must be fully compliant with the
specification in the data dictionary, including definitions, categories, etc.
Where no suitable variable already exists in the data dictionary, consideration should be
given to external standards that Police may desire to be compatible with. For example,
we may wish to compare crime statistics with Australia. Data dictionaries for the
Australian Standard Offence Classification (ASOC) and the Recorded Crime Victims
Statistics (RCVS), should be considered.
Where no such explicit standard applies, the Justice Sector data dictionary should be
consulted and, failing that, Statistics New Zealand should be consulted, in order to
identify a potentially suitable existing variable.
Where none exists, a new variable should be created and entered into the data
dictionary, with all required information about it completed. This new variable must have
a unique label - one that is not already in use elsewhere. Subject to this limitation, the
label should be as intuitive as possible.
The data dictionary should be made as accessible as possible on-line to all staff in
Police.
Free-text format fields should only be used in the statistics subset as a last resort. Where
they are included, comprehensive instructions must be provided, specifying how to use
these fields, and coding schemes must be created to encode data in them for the
purpose of producing statistical information.
Note: The contents of excluded fields may be reported as qualitative information, but
excluded fields may not be used to determine which records to aggregate when
calculating statistics. (E.g. they are not to be used in 'Condition' statements of SQL
queries.)

Design of classifications
Statistics New Zealand's 'Best Practice for Classifications' should be referred to and
complied with when designing categorical variables in forms and IT applications.
Define 'classification'.
Some specific requirements include:
For categorical variables, categories must be mutually exclusive and exhaustive.
For each categorical variable, a flat or hierarchical classification structure must be used.
"A flat structure should be used when a simple listing is required or when there is no
requirement to aggregate or group categories into more meaningful categories."
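For numeric bands, the 'mutually exclusive and exhaustive' requirement can be checked mechanically. The following is a minimal sketch, assuming each category is a half-open interval [start, end) over a known overall range; the function name and the age bands used are hypothetical. Nominal categories (E.g. offence types) cannot be checked this way and need review against their definitions.

```python
def check_bands(bands, lower, upper):
    """Return a list of problems found in half-open [start, end) bands.

    bands: list of (name, start, end) tuples.
    lower, upper: the overall range [lower, upper) that must be covered.
    An empty list means the bands are mutually exclusive and exhaustive.
    """
    problems = []
    expected_start = lower
    for name, start, end in sorted(bands, key=lambda band: band[1]):
        if start < expected_start:
            problems.append(f"overlap before {name}")   # not mutually exclusive
        elif start > expected_start:
            problems.append(f"gap before {name}")       # not exhaustive
        expected_start = max(expected_start, end)
    if expected_start < upper:
        problems.append("gap at upper end")             # not exhaustive
    return problems


# Hypothetical age bands, in completed years.
age_bands = [("0-16", 0, 17), ("17-24", 17, 25), ("25 and over", 25, 150)]
age_band_problems = check_bands(age_bands, 0, 150)
```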

Statistical balance should be considered when deciding the level of aggregation in
establishing category boundaries. Although it is accepted that certain infrequently
occurring types of instances may require a unique category for operational purposes,
unless such granularity is operationally required, aggregation in setting category
boundaries should seek to make the frequency counts in each category of similar order.
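One possible way to screen for imbalance is sketched below. It assumes that 'similar order' can be read as 'within a factor of ten of the largest category'; that threshold, the function name and the counts are assumptions made for illustration, not part of the workshop conclusions.

```python
def imbalance_report(counts, max_ratio=10.0):
    """Return the categories whose count is more than max_ratio times
    smaller than the largest category's count.

    counts: dict mapping category name to frequency count.
    max_ratio: assumed threshold for 'similar order' (default: factor of 10).
    """
    if not counts:
        return []
    largest = max(counts.values())
    return sorted(name for name, n in counts.items() if n * max_ratio < largest)


# Hypothetical frequency counts for one categorical variable.
example_counts = {"Category A": 500, "Category B": 450, "Category C": 12}
flagged = imbalance_report(example_counts)
```

A flagged category is a prompt for review rather than an automatic candidate for merging, since it may be required for operational purposes.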
The level of precision that values or categories can take should reflect the level of
precision appropriate for operational purposes in the context where the data is recorded.
For example, it is inappropriate to record the specific level of aggravation in a violent
offence when a member of the public is reporting an assault over the phone; conversely,
when charging an alleged offender, a specific charge is required.
When modifying or designing new classifications, the design should attempt to be robust
against future needs.
Statistical feasibility is part of this consideration. Instead of attempting to fit all
responses, as is required for a survey, for administrative data we need to ensure all
operational scenarios can be fitted into the classification.

Mandatory entry
Data entry should not be prompted with default values in fields, as this introduces both
an error component and statistical bias.
Allow an 'Unknown' category wherever not knowing is a valid operational scenario. Avoid
force-fitting unknown, as doing so would introduce an error component.
The IT front-end should force valid and mandatory entry of all fields which may be used in
producing statistical information.
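These three rules could be combined in a front-end validation routine along the following lines. This is a sketch only: the function name, the messages and the example categories are hypothetical, and it covers a single categorical field.

```python
UNKNOWN = "Unknown"


def validate_entry(value, allowed, unknown_permitted):
    """Validate one categorical field on an entry screen.

    value: what the user entered (None or "" if nothing was entered).
    allowed: set of valid categories, excluding 'Unknown'.
    unknown_permitted: True where not knowing is a valid operational scenario.
    Returns (ok, message).
    """
    if value is None or value == "":
        # No default is substituted; the user must make an explicit entry.
        return False, "Entry is mandatory."
    if value == UNKNOWN:
        if unknown_permitted:
            return True, "ok"
        return False, "'Unknown' is not permitted for this field."
    if value not in allowed:
        return False, f"'{value}' is not a valid category."
    return True, "ok"


# Hypothetical categories for illustration.
location_types = {"Urban", "Rural"}
result = validate_entry("Unknown", location_types, unknown_permitted=True)
```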

Reliability of measurement
If a variable or category cannot be measured with adequate reliability, even if such
information is desired, it should be excluded from the statistics subset.
Rationale: Avoids misinformation
Definitions required for 'reliability' and 'adequate'. This may be difficult to assess.

Modifications to existing designs
Modifications to existing forms or IT applications may impact the continuity of time-series
or the reliability of existing variables.
Guidelines should be given in the redesign, to account for this. For example, any impact
analysis, mapping, metadata creation, classification version numbers, documentation of
form and system changes, etc.
One option is to create a new version number to reflect real world change or where there
is a change to structure or content. For example a new version could be created when
new categories are required, old categories deleted, or where the statistical unit being
classified changes.

Also, where a definition changes, this needs addressing in metadata. For example, what
Police treat as being 'rural' may change, even though the form doesn't change.
Ideally the data dictionary should include a database that identifies which variables are
contained in which forms and IT application screens. Modifications to forms, screens or
variables should check this record to maintain alignment.
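Such a record could be as simple as an index from each variable label to the forms and screens that use it. The sketch below is illustrative only; the class name, variable labels and form names are hypothetical.

```python
from collections import defaultdict


class VariableUsageIndex:
    """Records which forms and IT application screens use each variable."""

    def __init__(self):
        self._used_in = defaultdict(set)

    def record_use(self, variable_label, form_or_screen):
        self._used_in[variable_label].add(form_or_screen)

    def affected_by(self, variable_label):
        """List every form or screen to check when this variable changes."""
        return sorted(self._used_in.get(variable_label, set()))


# Hypothetical usage records.
index = VariableUsageIndex()
index.record_use("location_type", "Form A")
index.record_use("location_type", "Screen B")
index.record_use("age_years", "Form A")

to_review = index.affected_by("location_type")
```

Before a variable is modified, the list returned for that variable identifies where alignment must be maintained.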

Prevention of work-arounds
It is acknowledged that administrative data is used for different purposes by different
users throughout an organisation. Such users often seek to 'innovate' expedient
solutions to report statistics, without giving due consideration to 'quality' issues.
Although there is no single method of preventing this, a combination of steps should be
taken, if an organisation wishes to ensure its statistical information has adequate quality.
These steps should attempt to make it easy for staff to do the right thing, make it hard for
staff to break the rules, and include consequences for breaking the rules. Specific steps
an organisation can take include:
    •    Producing guidelines
    •    Producing policy
    •    Including compliance in audit and performance management frameworks
    •    Ensuring back-end IT applications used to extract and present statistical
         information, restrict inclusion of non-compliant variables/fields in statistical
         reports. Optional tactical approaches include:
         1. Ensure that any back-end applications used to extract and present statistical
            information, are designed in such a way that variables outside the statistics
            subset cannot be used as statistical attributes. For example, they cannot be
            used in SQL Condition statements or as categorical variables in SQL Select
            statements.
         2. Ensure that any back-end applications used to extract and present statistical
            information are designed in such a way that if variables outside the statistics
            subset are used to select records to report, they can only be used to report
            lists; not measures.
         3. Ensure that any back-end applications used to extract and present statistical
            information are designed in such a way that if variables outside the statistics
            subset are used to select records to report, such reports will not include
            algebraic computations. (E.g. summation)
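A minimal sketch of tactical approaches 2 and 3 follows, assuming report requests reach the back-end application in a structured form rather than as raw SQL. The class, the function and the field names are hypothetical. Under approach 1, the same check would simply be applied to every request, whether or not it aggregates.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ReportRequest:
    filter_fields: List[str] = field(default_factory=list)    # used in conditions
    group_by_fields: List[str] = field(default_factory=list)  # used as categories
    aggregate: bool = False  # True for counts or sums, False for a plain list


def non_compliant_fields(request: ReportRequest, statistics_subset: Set[str]) -> List[str]:
    """Return the fields that make the request non-compliant (empty if compliant).

    Plain lists may select on any field. Aggregated reports may only select
    and group on fields that belong to the statistics subset.
    """
    if not request.aggregate:
        return []
    used = request.filter_fields + request.group_by_fields
    return sorted({name for name in used if name not in statistics_subset})


# Hypothetical statistics subset and requests.
subset = {"offence_type", "district"}
count_by_notes = ReportRequest(
    filter_fields=["rough_notes"], group_by_fields=["district"], aggregate=True
)
list_by_notes = ReportRequest(filter_fields=["rough_notes"], aggregate=False)

rejected = non_compliant_fields(count_by_notes, subset)
allowed = non_compliant_fields(list_by_notes, subset)
```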

Appendix 2: Recommendations for the design-rules
Determine the scope of the change
When only a part of a form or IT system is being modified, a decision is required on
whether to limit the application of these design-rules to just that part or whether to take
the opportunity of reviewing the whole form, IT application, or relevant module of the IT
application.
This decision should be made on a case-by-case basis, applying the principles
articulated in section 2.1 of Appendix 1.
The policy should recognise that judgment needs to be applied. However, it should
always require consideration of the principles articulated in section 2.1 of Appendix 1
and, where appropriate, provide documentation (E.g. metadata, training materials, etc.).

Define the variables/fields to incorporate, with the aid of a Data
Dictionary
Check all of the variables determined to be in-scope in section 3.1 above, against the
organisation's data dictionary.
If a proposed variable is the same or similar to an existing variable in the data dictionary,
where such similarity is considered adequate for operational purposes, the existing
variable should be used in preference to creating a new one.
In general, use a structured format in preference to free text, wherever possible.
Wherever the gatekeeper proposes a change from the proposal, this should be worked
through with the proposer and, in turn, stakeholders identified on the template.
Variables on different systems with the same label must be fully compliant with the
specification in the data dictionary, including definitions, categories, etc.
Where no suitable variable already exists in the data dictionary, consideration should be
given to external standards that Police may desire to be compatible with. For example,
we may wish to compare crime statistics with Australia. Data dictionaries for the
Australian Standard Offence Classification (ASOC) and the Recorded Crime Victims
Statistics (RCVS), should be considered.
Where no such explicit standard applies, the Justice Sector data dictionary should be
consulted and, failing that, Statistics New Zealand should be consulted, in order to
identify a potentially suitable existing variable.
Where none exists, a new variable should be created and entered into the data
dictionary, with all required information about it completed. This new variable must have
a unique label - one that is not already in use elsewhere. Subject to this limitation, the
label should be as intuitive as possible.
Free-text format fields should only be used in the statistics subset as a last resort. Where
they are included, comprehensive instructions must be provided, specifying how to use
these fields, and coding schemes must be created to encode data in them for the
purpose of producing statistical information.
