Towards High Quality Administrative Data - A Case Study: New Zealand Police
Gavin M. Knight
New Zealand Police
__________________________________________________________________
This report was commissioned by Official Statistics Research, through Statistics New
Zealand. The opinions, findings, recommendations and conclusions expressed in this
report are those of the author(s), do not necessarily represent Statistics New Zealand
and should not be reported as those of Statistics New Zealand. The department takes no
responsibility for any omissions or errors in the information contained here.
Citation: Knight, G. (2008). Towards high quality administrative data – A case study: New Zealand Police,
The Official Statistics System, Wellington, Official Statistics Research Series, Vol 3
ISSN 1177-5017
ISBN 978-0-478-31514-1 [Online], available: www.statisphere.govt.nz/official-statistics-research/series/vol-3

Abstract

Much has been written about principles and standards for designing surveys to ensure good quality statistical information results. Less has been written about standards for administrative data. While it is thought that many of the same principles may apply, the terminology is often different and contextual differences exist that may require changes in the form, if not the substance, of design-standards. For example, a survey questionnaire is usually designed to be completed by a sampled respondent just once, whereas a form used to capture data for an operational IT system may be filled out many times a day by the same person in order to record information required by that person to perform their job. Efficiency and relevance may therefore have different implications for the design of such forms. This paper documents a project undertaken as a case-study on New Zealand Police that sought to identify principles to assist with designing good quality administrative data. Recommendations are made, based on these principles.

Keywords: Administrative data, quality, form, New Zealand Police

Official Statistics Research Series, Vol 3, 2008
ISSN 1177-5017
www.statisphere.govt.nz/official-statistics-research/series/vol-3
Contents

The main body of this paper describes the project, its methodology, and a summary of results and conclusions.

Background
Methodology
Phase I: Business Processes
    Description of Phase I
    Results of Phase I
Phase II: Design-Principles
    Description of Phase II
    Results of Phase II
Recommendations
    Process and accountabilities
    Design-rules for guardians
    Ensuring compliance
Next Steps
Conclusions
Appendix 1: Summary of focus-group workshops
Appendix 2: Design-rule recommendations
Appendix 3: Integrationworks' report at conclusion of Phase I
Background

This project was undertaken between November 2006 and August 2007, as part of the Official Statistics Research programme administered by Statistics New Zealand.

Principal partners in this project were New Zealand Police (NZ Police), which was responsible for leading the research and was the subject of the case-study, and Statistics New Zealand (Stats NZ), which not only contributed significant funding to the project, but also provided a 'project ownership' role to which Police were accountable for reporting and implementing the project as planned. NZ Police, Stats NZ and other organisations identified later in this paper contributed to the research itself.

The project aimed to produce a standard, consisting of principles and rules, as appropriate, that will be applied by NZ Police to ensure capture of quality statistical information in administrative systems, and which is sufficiently generic that it could potentially be applied by other agencies.

Note: This paper uses the terms 'standard', 'principles', and 'rules' as interrelated concepts, often preceded by the word 'design'. When the word 'standard' is used as a noun in this paper, it should be interpreted as a combination of 'rules' and 'principles'. It may be thought of as a concept; not necessarily a specific tangible document. Rules, on the other hand, must be explicitly documented.

Methodology

The project took the form of a case study on NZ Police, with two phases. The first phase aimed to understand the current business processes where decisions are made about the design of forms and IT applications and, from this understanding, identify where in these processes design-principles should be applied. The second phase aimed to determine what these design-principles should be.

An experienced project team was formed, whose makeup varied throughout the project, adapting to the requirements of the current stage of work.
However, for continuity, two members of the team were involved from start to finish: Gavin Knight from Police National Headquarters (PNHQ) and Simon Thomson from Stats NZ's Collection and Classification Standards unit.

The lead for Phase I was contracted out to Integrationworks, a firm specialising in data, system and application integration. Integrationworks reported to the project team, which acted as a steering group. This phase involved interviewing Police staff involved in changing forms, IT applications and business processes. It also reviewed Police documentation about the business processes relevant to these functions.

Phase II was led by Police, who facilitated a number of workshops, as a form of focus-group, involving practitioners from various government agencies who work with administrative data. This group, informed by existing literature and the results of Phase I, discussed practical implications and issues. It considered options for addressing these through an administrative data design-standard (consisting of principles and rules), and formed, by consensus, a view of what this standard should be.
Phase I: Business Processes
Description of Phase I
Phase I commenced in November 2006 and concluded in March 2007.
Initially in Phase I, the project team consisted of:
• Gavin Knight, National Statistics Manager, PNHQ, NZ Police
• Fiona Morris, Performance Officer, PNHQ, NZ Police
• Simon Thomson, Statistical Analyst, Collection and Classification
Standards, Stats NZ
• Bridget Murphy, Justice Subject Matter Project Manager, Social
Conditions, Stats NZ
• Barb Lash, Statistical Analyst, Social Conditions, Statistics New Zealand
PNHQ project team members, assisted by members of Police's Information Technology
Communications Service Centre (ICTSC), selected 'Integrationworks' from three
prospective IT consulting firms, to take the lead in Phase I.
Nick Borrell from Integrationworks interviewed eighteen Police Subject Matter Experts
(SMEs), identified by PNHQ members of the project team. These SMEs included a mix
of sworn and non-sworn police staff and represented a variety of roles within Police,
including:
• Systems analysts
• Business analysts
• Project managers
• A file centre manager
• Various IT managers
• Area commanders
• An Area tactical response manager
• Intelligence section supervisors
Most SMEs interviewed were of middle-management seniority, ranging from Sergeant
to Inspector in rank or rank-equivalent (non-sworn staff).
In selecting SMEs to interview, Police took into account who had been involved in
requesting or managing changes to forms, IT applications and business processes,
either regularly, or in recent projects or initiatives.
In addition to interviewing SMEs, Nick Borrell reviewed existing Police documentation
relating to the business processes involving making changes to forms and IT
applications.
The project team acted as a steering group for Phase I, providing direction and feedback
to Nick Borrell, as the project progressed. It also assisted with answering questions and
removing roadblocks, such as facilitating access to SMEs.
By the end of Phase I, the project team's makeup had altered slightly, as Stats NZ
sought to provide the appropriate expertise for giving feedback to Nick Borrell and
reviewing his draft report. In particular, Stats NZ replaced Bridget Murphy with Liping
Jiang, Subject Matter Project Manager, Collection and Classification.
Results of Phase I
Nick Borrell submitted his final report on 22 March 2007. This report described the
process undertaken, documented findings, and made a number of recommendations,
primarily relating to the business processes by which forms and IT applications are
designed.
Nick's report is attached as Appendix 3. Key findings included:
• No framework exists to standardise and manage data capture,
• There are no resources (e.g. guideline manuals) available to staff for
addressing data capture,
• The existing process to manage form-changes does not consider data quality
or standards,
• The process to manage forms can be circumvented by staff - leading to
unauthorised changes,
• Staff do not appear to know how to initiate changes to forms or policing
procedures, and
• ICTSC is perceived as the de-facto owner of all data quality issues, yet
resolving issues of statistical information quality is not a core function of
ICTSC.
The report also made a number of recommendations aimed at addressing the key
findings in a way that minimises the barriers to implementing change, by avoiding
significant process reengineering. Instead, the report takes into account existing
business processes and functional groups. It recommends the minimum modification
necessary to existing processes and work-group functions to effect required
improvements.
It is acknowledged that this tactical approach, applied to other organisations, may result
in different business processes. However, there is a tension to be balanced between
achieving an organisation's buy-in to making change and creating a business process
that is common to all organisations. The latter was viewed as unrealistic and not
necessarily desirable anyway. Whereas principles for data quality may be common, it
may be appropriate for different types of organisations to have different business
processes.
Phase I therefore made recommendations about distinct features of an effective process,
rather than simply recommending a specific process. Such features may have greater
relevance to other organisations than the Police-specific processes that are
recommended.
The key recommendations in Integrationworks' report are:
• Create a data quality framework which supports guidelines and principles
for data quality,
• Establish form-change guardianship,
• Appoint data quality guardianship to IT applications,
• Develop design-standards for data to be captured,
• Bring ICTSC processes in line with design-standards, and
• Develop policy to support design standards.
These are detailed more fully in Appendix 3. However, in short, they involve
standardisation of processes, creation and enforcement of design-standards, and
allocation of responsibility for application of standards when changes to forms and IT
applications are proposed by the business.
Phase II considered what these design-standards should be.
Phase II: Design-Principles
Description of Phase II
Phase II commenced in April 2007 and concluded in August 2007.
The project team for Phase II consisted of:
• Gavin Knight, National Statistics Manager, PNHQ, New Zealand Police
• Chris Worsley, Statistics Business Analyst, PNHQ, New Zealand Police
• Simon Thomson, Statistical Analyst, Collection and Classification
Standards, Statistics New Zealand
• Matt Flanagan, Statistical Analyst, Collection and Classification
Standards, Statistics New Zealand
• Barb Lash, Statistical Analyst, Social Conditions, Statistics New Zealand
• Robyn Smits, Manager, Data Management Unit, Ministry of Education
• Dr. Karolyn Kerr, Manager Information and Analysis, Central Region
Technical Advisory Services (Health)
• Jason Gleason, Senior Data Analyst, Justice Sector Information Strategy,
Ministry of Justice.
Additionally, Ian Smith, Police's National Applications Manager, and Senior Sergeant
Bernie Geraghty, Police's National Coordinator of Business Analysts, attended one
meeting. Senior Sergeant Geraghty subsequently continued to provide feedback on
notes from workshop meetings.
Informed by the Phase I report, which identified how a design-standard would be used,
project team members from Statistics New Zealand collated documents containing
principles and standards which it was thought might usefully inform the development of
principles and design-rules for administrative data. These documents included:
• "Quality Protocols" of the (New Zealand) Official Statistics System, produced
by Statistics New Zealand.
• "A Guide to Good Survey Design" ( July1995), produced by Statistics New
Zealand, ISBN: 0-477-06492-2
• "Best Practice Guidelines for Classifications", used by Statistics New
Zealand's 'Classifications and Standards' unit
• "Official Statistics System Administrative Data Guidelines"
• "Draft principles for designing forms, processes and IT applications to ensure
desired statistics have acceptable quality" (August 2006), a desk-file used by
the Statistics Unit at PNHQ
It was apparent to the project team that the principles in most of the above documents
had been developed from the context of surveys. In a couple of instances there had
been an attempt to adapt principles developed for surveys to administrative data.
However, gaps remained.
Having collated and reviewed the above documents, principles from them (many of
which appeared in more than one document) were explicitly identified and discussed by
the project team, in terms of their relevance to administrative data.
Project team members were asked to consider both these principles and gaps in the
principles, based on their experience in working with administrative data.
Discussion occurred in six workshops, held over a five-month period. Notes were
taken at the workshops, particularly concerning conclusions and the associated
rationale. These notes were reviewed by workshop participants between meetings.
There were no a priori assumptions of validity or intrinsic merit of any suggestions
expressed by team members in the workshops. Team members were asked to consider
all information presented, taking into account both the consistency with their own
experiences and the soundness of the rationale behind ideas.
The result, which is documented in Appendix 1, should therefore not be treated as
empirical, but should be treated as expert opinions that have survived peer-review in a
focus-group context.
The project sponsor (Police) acknowledges the willingness of project team members to
participate in such an exercise, where opinions were challenged in an effort to achieve a
robust result. In general, project team members felt that the result was superior to what
could have been produced by any individual, with ideas from one team member
prompting ideas in others.
Results of Phase II
Notes from the discussions are attached as Appendix 1. A summary of key conclusions
is as follows:
Many of the principles that are applicable to design of survey data are applicable to
administrative data as well. However, sometimes a slight change in terminology is
needed, to reflect the different context.
Ensuring good administrative data requires some additional components which are
either not relevant to or manifest differently from survey data. These include preventing
'work-arounds' by staff wanting to report statistical information without complying with
rules, and the need to relate level of detail in a classification to the level of detail required
by the underlying operational process generating the data.
The Phase I report (Appendix 3) proposed specific processes for developing forms and
IT applications. In Phase II, it was identified that the process of developing IT
applications typically produces increasing design-detail as the analysis and design
phases of IT projects progress. Not all of the factors necessary to determine alignment
with the principles and design-rules may be evident at the outset. The implication is that
the checking of a proposed design may be more iterative than indicated in the process
diagram suggested by the Phase I report.
As identified in Phase I, to effect quality statistical information from administrative data,
staff require resources (in the form of processes, practices and design guidelines)
relevant to their roles. Generic quality principles are not as useful as principles relevant
to the task at hand. The implication, recognised in Phase II, is that we must identify the
tasks and develop resources for each task. Such resources include:
• A template and guideline for developing a proposal for a new form or
modification to a form or IT application. (proposer)
• A manual for checking alignment of a proposal with the design-rules.
(guardian)
• Documentation of the business process(es) by which new and modified forms
and IT applications are made.
• A data dictionary for the organisation.
• Policy.
Recommendations
Process and accountabilities
1. Create a documented standard process for processing proposed forms or
modifications to forms and IT applications.
2. Create policy that assists and ensures compliance with this process.
3. Appoint a person or group in the organisation as 'guardian' in the process for
creating or modifying forms and provide this guardian with resources such as
training and/or guidelines that incorporate design-principles and rules. The
guardian's role is to consider proposed new forms or modifications to forms
against these design-principles and rules, provide feedback and suggested
changes to the proposer, and to ensure proposals do not proceed until and
unless they comply with the design-rules.
4. Appoint a person or group in the organisation as 'guardian' in the process for
creating or modifying IT applications and provide this guardian with resources
such as training and/or guidelines that incorporate design-principles and rules.
The guardian's role is to consider proposed IT application designs against the
design-principles and rules, provide feedback and suggested changes to the
proposer, and to ensure proposals do not proceed until and unless they comply
with the design-rules.
• There may be a separate IT application guardian for each IT application,
or one group may take this responsibility for many or all IT applications.
The guardian may or may not be the same as the guardian for forms.
• For NZ Police, the guardian for forms should be the Applications Support
group in ICTSC, as this would require the minimum change from existing
accountabilities. The guardian or guardians for IT applications should be
determined by the Manager ICT. However, to avoid overlapping
accountabilities, as a minimum, there should be no more than one
guardian for any given IT application.
5. Appoint a person or group in the organisation to have responsibility for
developing, maintaining and communicating 'design-rules for administrative data'
that ensure quality statistical information.
• For NZ Police this should be the National Statistics Manager in PNHQ, as
this role best aligns with existing competencies.
6. Have a standard 'change-request' form that accompanies proposals for new
forms or modifications to forms or IT applications. The change-request form is
distinct from any business case justifying and seeking approval for the change.
Rather, its purposes are to (a) prompt the proposer to identify and consult with
stakeholders, and (b) capture the information required by the guardian to assess
the proposal against the design-rules and provide feedback or suggest changes
in order to comply.
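The two purposes of the change-request form described above can be sketched in code. This is a purely illustrative sketch; the field names and readiness check are assumptions for illustration, not part of the project's recommendations, and the actual form would be designed by Police:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ChangeRequest:
    """Hypothetical change-request accompanying a proposed new or modified
    form or IT application (distinct from any business case)."""
    proposer: str
    description: str  # what is being added or modified, and why
    # Purpose (a): prompt the proposer to identify and consult stakeholders.
    stakeholders: List[str] = field(default_factory=list)
    stakeholders_consulted: bool = False
    # Purpose (b): capture information the guardian needs to assess the
    # proposal against the design-rules.
    affected_variables: List[str] = field(default_factory=list)

    def ready_for_guardian(self) -> bool:
        """A guardian should not assess a proposal until stakeholders have
        been identified and consulted."""
        return bool(self.stakeholders) and self.stakeholders_consulted

# Example: a proposal is not ready until consultation has occurred.
cr = ChangeRequest(proposer="District analyst",
                   description="Add a new tick-box to the occurrence form",
                   affected_variables=["occurrence_type"])
assert not cr.ready_for_guardian()
cr.stakeholders = ["National Statistics Manager", "Manager Applications Support"]
cr.stakeholders_consulted = True
assert cr.ready_for_guardian()
```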
7. Create a data dictionary for the organisation, as a central register of variables,
containing their labels, definitions, formats, scales, ranges, and instructions to be
given to people capturing data.
• The data dictionary should be made as accessible as possible on-line to
all staff in the organisation (E.g. Police).
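As a minimal sketch of the data dictionary described in recommendation 7 (the entry fields and example variable are assumptions for illustration, not the actual Police register), each variable could carry its label, definition, format, allowed values and capture instructions, keyed by a unique name:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variable:
    """One entry in the organisation's central register of variables."""
    name: str                        # unique identifier used across forms and systems
    label: str                       # human-readable label shown on forms
    definition: str                  # agreed business definition
    data_format: str                 # e.g. 'date (YYYY-MM-DD)', 'code', 'free text'
    allowed_values: tuple = ()       # closed classification, if any
    capture_instructions: str = ""   # guidance for staff recording the data

class DataDictionary:
    """Central register of variables, keyed by variable name."""
    def __init__(self):
        self._entries = {}

    def register(self, var: Variable) -> None:
        # Refuse duplicates, so a single agreed definition exists per variable.
        if var.name in self._entries:
            raise ValueError(f"variable '{var.name}' is already registered")
        self._entries[var.name] = var

    def lookup(self, name: str) -> Variable:
        return self._entries[name]

# Example: a proposer checks the existing definition before adding a field.
dd = DataDictionary()
dd.register(Variable(
    name="occurrence_type",
    label="Occurrence type",
    definition="Classification of the reported occurrence",
    data_format="code",
    allowed_values=("burglary", "theft", "assault"),
    capture_instructions="Select the single best-fitting code.",
))
```

Making this register queryable online by all staff is what turns it from documentation into a shared reference at the point of form design.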
Design-rules for guardians
8. A manual should be created to assist guardians in checking proposals against
design-rules.
• The manual should incorporate the recommendations in Appendix 2.
Ensuring compliance
9. Policy should be created to assist and ensure compliance with these
recommendations
10. IT designers should endeavour to prevent statistical reporting of data that has
been excluded from the data to which the design-rules have been applied.
• For example, for NZ Police this may involve ensuring all statistical
reporting occurs via the data warehouse, rather than directly from
proprietary operational IT systems. It may also require that 'Business
Objects' universes be designed to support the 'design-rules' and not
report other data in a form that facilitates statistical reporting.
• One mechanism for controlling this is through the governance processes
for approving expenditure. For example, the organisation should consider
how payments are currently approved to IT system suppliers for
developing reports or reporting capability on their systems.
• One type of change-request may involve making data from an operational
IT system conform with the design-rules, thereby improving its quality and
enabling statistical reporting from it.
11. IT designers should endeavour to prevent creation of or modification to database
fields that bypass guardians and/or do not comply with the design-rules.
12. The Assurance group at PNHQ should incorporate auditing of new or newly
modified forms and IT applications into its audit framework, to check for
compliance with policy.
Next Steps
13. Statistics NZ should consider the potential for developing a guideline/manual for
government agencies, incorporating principles recommended by this project.
14. Police's National Statistics Manager should, in consultation with ICTSC, develop
the recommended manual for guardians and the template and guidelines for
proposing new forms or modifications to forms and IT applications.
15. Police's ICTSC should, in consultation with the Statistics Unit in Organisational
Performance Group PNHQ, document a standard process for creating and
modifying forms and IT applications.
16. Police's Applications Support group should, in consultation with the Statistics Unit
in Organisational Performance Group PNHQ, develop an initial data dictionary
and a manual for its use, then integrate maintenance of the data dictionary into
the above new standard process.
17. The Manager ICT should allocate guardianship responsibilities to staff, in line
with the recommendations of this project.
18. Police's Policy unit should draft, consult, and seek appropriate approval for policy
that supports these recommendations.
Conclusions
This project, although not empirical, has provided a valuable step forward in
understanding both design-principles for administrative data, and issues that require
addressing in order to introduce such principles into the business.
The project did not examine in any detail what constitutes 'quality' in administrative data.
Rather, using 'quality' principles already identified in the literature, the project looked at
how these manifest in an administrative data context and considered design-principles to
address them.
Many of the principles applicable to design of survey data are also applicable to
administrative data. However, the context of creating survey data differs markedly from
that of creating administrative data. This leads to a need for creation of resources in a
variety of forms, in order to effect good quality statistical data. These resources include:
• Standard documented processes,
• An organisation-wide data dictionary,
• Allocation of defined roles and responsibilities,
• Documented guidelines for staff to undertake these responsibilities,
• Organisational policy to assist and ensure compliance, and
• Audit, to ensure compliance.
Although undertaken as a case study on New Zealand Police, most results - at the
principles and rules level - do not appear to be uniquely applicable to Police, but would
be of more generic applicability to any organisation producing administrative data.
Results are therefore encouraging.
Appendix 1: Notes from workshops in Phase II
1. Aims of this policy
A policy is required that assists and ensures Police staff create and modify forms and IT
applications in a standardised way that incorporates principles which ensure quality
statistical information can be produced.
Determining content is not in scope for the policy/guidelines
This policy affects front-end design; not IT system data architectures or back-end
extraction tools and standards. These are also required to deliver quality statistical
reporting, but are out of scope for this particular policy.
This policy does not dictate what information Police must record; neither does it dictate
what recorded information will be used to generate statistical information. Instead, once
the business has made these decisions, this policy affects aspects of how forms and IT
application front-ends will be designed.
That said, it is acknowledged that unless centralised control exists over the process by
which decisions are made about the design of forms and IT applications, standards will
be impossible to apply in practice. Therefore, this policy must include relevant aspects of
business process.
As part of this, consideration must be given to whose responsibility it is to determine
what statistical information is required from a particular form or IT application
development. Options considered, along with their strengths and weaknesses, were:

Option 1: Gatekeeper/guardian
• Strengths: A centralised role who is involved in the process at the appropriate stage.
• Weaknesses: Places much higher responsibility on the guardian than envisaged. The guardian is a facilitator, rather than a business owner.

Option 2: Proposer
• Strengths: Likely to come from the business area driving the change.
• Weaknesses: Involvement in this process is one-off, so the proposer is unlikely to have expertise in identifying stakeholders and determining information needs.

Option 3: Business owner of the functional area that creates the data
• Strengths: Will be a key stakeholder for the affected data and likely to understand the business implications in that area.
• Weaknesses: May not appreciate broader synergies, information management implications and stakeholder needs.

Option 4: Business owner of the IT application
• Strengths: Will understand the directly affected system.
• Weaknesses: IT groups should have a capability responsibility, rather than business ownership of the data.

Option 5: A centralised non-IT business information group (E.g. Headquarters Statistics Unit)
• Strengths: Is centralised and is likely to understand statistical issues and how to identify the breadth of information needs.
• Weaknesses: This function may not be consistent with the core function of the group.
Additionally, wherever the function sits, there may be resourcing implications.
Conclusion: It is not necessary for one group to carry this responsibility alone. Instead, a
standardised 'change-request' form should be created for use at step 2 of the processes
recommended in sections 3.2 and 3.3 of the Integrationworks report (Appendix 3).
Responsibility for completing the change-request form should be with the proposer.
However, the form should guide the proposer to identify stakeholders and ensure they
are consulted. Such stakeholders will be a mix of operational business managers and
centralised expert groups. E.g. District Commanders, National Statistics Manager and
the Manager Applications Support.
On certain key projects, it may be appropriate for one of these stakeholders to take over
responsibility for proposing the change.
Such an approach leaves responsibility for doing the leg work with the proposer, but
ensures input from a centralised group who can apply expertise that the proposer does
not necessarily have.
So what is in-scope? - Designing the capture of what is recorded
Our aim is to ensure the quality of whatever statistical information Police desire to be
produced by:
• Requiring that any proposed creation or change of a Police form be considered
by a 'gate-keeper' or 'guardian' with the responsibility and knowledge to ensure
the form is designed in such a way that resultant statistical information will be of
good quality.
o The project team notes that what is meant by 'quality' requires definition.
However, such a definition may not differ markedly from existing
documented definitions regarding survey data, such as are contained in
the documentation reviewed.
• Requiring that any proposed creation or change of a Police IT application be
considered by a 'gate-keeper' or 'guardian' with the responsibility and knowledge
to ensure the IT application will capture information in such a way that any
statistics Police desire be produced from this system will have acceptable quality.
• Ensuring that these gatekeepers ask the following questions:
o "Of all of the information that is proposed to be recorded on the form/in
the IT application, what subset of this information do Police require
statistics to be derived from?" (the statistics subset), and
o "Is this subset sufficient to provide the desired statistical information?" (If
not, either the statistics subset needs to be extended, additional
information needs to be recorded or, where not possible, it needs to be
accepted that not all of the desired statistical information can be produced
from this particular form or IT application.)
Note: It is anticipated that most data recorded will, by default, be
part of the statistics subset unless specifically excluded.
Subsequent discussion will therefore speak as if all data is
included in the statistics subset. It is noted here, however, that
design-principles applied to this data will not necessarily be
relevant to data specifically excluded from the statistics subset.
• Preventing data that has been excluded from the statistics subset from being
extracted from Police systems in a way that enables such data to be used for
calculating statistical information. E.g. A free-text field used for capturing rough
notes may be excluded from the statistics subset. This field may be extracted to a
data warehouse along with other data and may be queried, but it may not be
used to select records for aggregation, such as in a 'Condition' statement in SQL.
(This stops work-arounds that would jeopardise quality by avoiding proper
design.)
• Providing guardians with design-rules that ensure forms and IT applications
capture data in such a way that good quality statistical information will result.
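The two gate-keeper questions above can be sketched as a simple check: fields are in the statistics subset by default unless specifically excluded, and the subset must cover the variables the desired statistics need. This is a minimal illustration; the field names are invented, not actual Police schema.

```python
def statistics_subset(fields, excluded):
    """All recorded fields are included by default unless specifically excluded."""
    return [f for f in fields if f not in set(excluded)]

def is_sufficient(subset, required):
    """Gate-keeper question 2: is the subset sufficient for the desired statistics?"""
    return set(required) <= set(subset)

# Hypothetical form fields: rough working notes are excluded by design.
form_fields = ["incident_date", "district", "offence_code", "rough_notes"]
subset = statistics_subset(form_fields, excluded=["rough_notes"])

print(subset)                                               # no rough_notes
print(is_sufficient(subset, ["district", "offence_code"]))  # True
```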
Note: This policy alone cannot ensure quality. It will do so as far as the design of forms
and IT applications is concerned, however to complement this policy, three further
aspects are required:
• policy is required to govern staff recording practices,
• data storage, transformation and access mechanisms (I.e. the IT back-end) need
to be designed appropriately, and
• correct practice throughout the system needs to be monitored in a performance
management framework.
These components are addressed through other aspects of Police's Statistics Strategic
Plan.
Furthermore, some important existing Police systems have already been built in a way
that would not comply with this new policy. As a result, statistical information available
from these systems is narrower in scope and lower in quality than desired. Projects to
retrospectively redesign aspects of these systems would be required if these limitations
are to be addressed.
This new policy will, however, prevent a repeat of bad design practices on these legacy
systems.
2. Design Rules (drafted as if for guardian)
Full vs partial modification
When only a part of a form or IT system is being modified, a decision is required on
whether to limit the application of these design-principles to just that part or whether to
take the opportunity of reviewing the whole form, IT application, or relevant module of
the IT application.
Whereas any change to a form or IT application provides an opportunity to address a
number of issues at the same time, a comprehensive change may require extensive
investment that may not be warranted by the desire for a minor improvement in a form.
Four distinct issues need to be considered:
• implications for the form/screen layout,
• implications for other systems that might use the same variable(s),
• implications for time-series, and
• operational implications.
Implications for the form/screen layout
This is the simplest impact to assess. The main requirement is that any changes must be
consistent with the rest of the form and avoid confusion. E.g. another part of the form
may refer to a field in the affected portion.
Implications for other systems
This may be more difficult to assess, unless some mechanism to link and/or match fields
in different systems exists.
If affected variables are contained in the organisation's data dictionary and if this data
dictionary identifies all forms and systems using the same variable, this will make the
task easier.
Conclusion: A data dictionary should be created for the organisation. (Refer section 3.3
for details).
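The cross-system impact assessment a data dictionary enables can be sketched as a simple lookup: given a variable, list every form and screen that uses it, so the effect of a change can be traced. The entries below are invented examples, not the actual Police register.

```python
# Hypothetical register mapping each variable to the forms/screens using it.
usage = {
    "ethnicity": ["POL400 form", "Occurrence entry screen"],
    "occurrence_type": ["Occurrence entry screen"],
}

def affected_artefacts(variable):
    """Forms and screens that must be reviewed if this variable changes."""
    return usage.get(variable, [])

print(affected_artefacts("ethnicity"))  # both artefacts using 'ethnicity'
```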
Implications for time-series
Altering variables or classifications can interrupt time-series. For example: mandating
entry of what was previously an optional field, introducing a new category, obsoleting an
existing category, or changing whether or not 'don't know' or 'missing data' is permitted.
Similarly, even altering how information is prompted can affect what is entered.
Literature on the design of survey measuring instruments makes this clear. In the
absence of evidence to the contrary, we should not assume this principle differs for
administrative data.
Changes should therefore only be made if important and, wherever possible, only after
the impact on time-series has been analysed.
Metadata accompanying resultant statistical information should note changes that
potentially affect time-series.
Operational implications
Will the change being proposed for the form/IT application work for the business,
including all affected stakeholders? For example, if a new field/variable is proposed, is it
operationally appropriate to collect this information at the stage of the business process
that the data is being recorded, or if a new category is being proposed, does that
category have operational relevance?
Conclusion
The decision about whether to make a partial or full modification when a change is
proposed, should be made on a case-by-case basis. The policy should recognise that
judgment needs to be applied. However, it should always require consideration of the
above four types of implications and, where appropriate, provide documentation (E.g.
metadata, training materials, etc.).
Statistical units vs attributes
In short, both may be variables, appearing as fields in forms and data entry screens.
However, statistical units are what is counted and attributes are the descriptors of what
is counted, and can be used to select what to count. (E.g. in 'Condition' statements of
SQL queries.)
Statistical units (or 'measures')
Considering the information to be collected on a form or in an IT system record, identify
the desired statistical units. I.e. what is to be counted.
Statistical units can be considered as being one of two types: 'Direct' and 'Derived':
Direct: These will typically be the 'record' (E.g. a form) that corresponds to the action or
object to which the form relates. For example, a transaction, occurrence, property
item, person, etc.
Derived: These do not necessarily have a one-to-one relationship with the core form or
record being captured in an instance. Derived measures need to be considered
carefully, firstly to determine feasibility and how to derive desired measures that
are not one-to-one with records, and secondly to be sure that the measure's
definition is valid.
An example of a derived measure would be where we wish to count the number
of children present at domestic violence incidents. A single record per incident is
created from the relevant form (for Police, this is the POL400 form). However,
this form contains fields specifying the number of children present at the incident.
So it is possible to count the number of children present at domestic violence
incidents.
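The distinction between direct and derived measures can be made concrete with the children-at-incidents example: the direct unit is the incident record, while the count of children is derived by summing a field on each record. The records below are invented illustrations, not real POL400 data.

```python
# One record per domestic violence incident; number of children present
# is a field on each record.
incidents = [
    {"incident_id": 1, "children_present": 2},
    {"incident_id": 2, "children_present": 0},
    {"incident_id": 3, "children_present": 1},
]

direct_count = len(incidents)                                   # incidents
derived_count = sum(r["children_present"] for r in incidents)   # children

print(direct_count, derived_count)  # 3 3
```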
From a statistical perspective, where it is possible to obtain statistical units both directly
and derived, it is preferable to obtain them directly. (E.g. a separate form for each person
present at an incident, rather than a count of such people on the incident report.)
However, this needs to be balanced against respondent burden, firstly through requiring
additional information be recorded that is not required for operational purposes, and
secondly through double-entering information that may already be recorded elsewhere in
the system.
Attributes
Identify the factors that will be used to select and characterise statistical units. These are
known as 'attributes'. Typical attributes may be:
• time periods,
• geographical areas,
• categories of measures, such as job-type, ethnicity, gender, age, cost-centre,
job-closure category, etc., or
• numerical values that describe the measure object, such as age, height, weight,
etc.
Different attributes may have different formats, such as categorical, numeric, date, etc.
Note: Unique identifiers of individual records (E.g. a file number) are not statistical
attributes, as they only apply to a single record and do not characterise measures.
Once statistical units and attributes have been decided, determine the hierarchical
relationships. For example, one Occurrence may contain one or more Offences, for
which there may be none, one or many Apprehensions of offenders. Similarly, one
District may contain a number of Police Areas which may contain a number of Police
Stations.
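The hierarchical relationships described above can be represented as a nested structure, which also makes counting units at each level straightforward. The identifiers below are invented for illustration.

```python
# One Occurrence contains one or more Offences, each with zero or more
# Apprehensions of offenders.
occurrence = {
    "occurrence_id": "OCC-1",
    "offences": [
        {"offence_id": "OFF-1", "apprehensions": ["APP-1", "APP-2"]},
        {"offence_id": "OFF-2", "apprehensions": []},
    ],
}

def count_units(occ):
    """Return (occurrences, offences, apprehensions) counts for one occurrence."""
    offences = occ["offences"]
    apprehensions = sum(len(o["apprehensions"]) for o in offences)
    return 1, len(offences), apprehensions

print(count_units(occurrence))  # (1, 2, 2)
```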
Check fitness-for-purpose
Design the change-request template indicated in sections 3.2 and 3.3 of the Stage 1
report to provide all of the information required by the guardian. For example, it should
work through all fields in the proposed form or IT application input screens, identifying
which fields are to be excluded from the statistics subset. For all other fields, suggested
labels, definitions, variable-type, scale and range, should be included in the template.
The template should also identify stakeholders of the relevant statistical information and
have their sign-off. Such sign-off is to mean that the stakeholder is satisfied that the
statistics subset is sufficient to provide all of the statistical information they expect from
what is collected on that form or IT application input screen.
In deciding how much data to record, balance respondent burden with desired
information. Also, design the structure of any data recorded to maximise the potential for
integrating other data and deriving additional information.
The template must make it clear to the proposer and stakeholders that it will be
impossible to produce statistics using excluded fields.
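One way to picture the change-request template is as a data structure carrying, per field, the suggested label, definition, type and exclusion flag, plus stakeholder sign-off. This is a sketch under assumed names; none of these classes or fields exist in any actual Police system.

```python
from dataclasses import dataclass, field

@dataclass
class FieldSpec:
    label: str
    definition: str
    variable_type: str            # e.g. "categorical", "numeric", "date"
    excluded_from_subset: bool = False

@dataclass
class ChangeRequest:
    form_name: str
    fields: list
    signoff: dict = field(default_factory=dict)   # stakeholder name -> signed?

    def ready_for_guardian(self):
        """Every identified stakeholder must have signed off before review."""
        return bool(self.signoff) and all(self.signoff.values())

cr = ChangeRequest("POL400", [FieldSpec("district", "Police district", "categorical")])
cr.signoff = {"National Statistics Manager": True}
print(cr.ready_for_guardian())  # True
```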
Design of variables
Check all of the variables in the proposed form or IT application against the
organisation's data dictionary.
This data dictionary is a central register of variables, containing their labels, definitions,
formats, scales, ranges, and instructions to be given to people capturing data.
The data dictionary is not specific to IT systems; rather it defines variables that may
appear in any system. It is important to retain the distinction between variable definitions,
which are about information, and IT systems, which are about technology.
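A data dictionary of this kind can be sketched as a central register keyed by a unique variable label, carrying the information listed above. The example entry and its categories are invented.

```python
dictionary = {}

def register(label, definition, fmt, categories=None, instructions=""):
    """Add a variable to the dictionary; labels must be unique organisation-wide."""
    if label in dictionary:
        raise ValueError(f"label already in use: {label}")
    dictionary[label] = {
        "definition": definition,
        "format": fmt,
        "categories": categories or [],
        "instructions": instructions,
    }

register("occurrence_type", "Category of reported occurrence", "categorical",
         categories=["Burglary", "Assault", "Other"])
print(sorted(dictionary))  # ['occurrence_type']
```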
If a proposed variable is the same or similar to an existing variable in the data dictionary,
where such similarity is considered adequate for operational purposes, the existing
variable should be used in preference to creating a new one.
In general, use a structured format in preference to free text, wherever possible.
Wherever the guardian proposes a change from the proposal, this should be worked
through with the proposer and, in turn, stakeholders identified on the template.
Variables on different systems with the same label must be fully compliant with the
specification in the data dictionary, including definitions, categories, etc.
Where no suitable variable already exists in the data dictionary, consideration should be
given to external standards that Police may desire to be compatible with. For example,
we may wish to compare crime statistics with Australia. Data dictionaries for the
Australian Standard Offence Classification (ASOC) and the Recorded Crime Victims
Statistics (RCVS) should be considered. Where no such explicit standard applies, the
Justice Sector data dictionary should be consulted and, failing that, Statistics New
Zealand, in order to identify a potentially suitable existing variable.
Where none exists, a new variable should be created and entered into the data
dictionary, with all required information about it completed. This new variable must have
a unique label - one that is not already in use elsewhere. Subject to this limitation, the
label should be as intuitive as possible.
The data dictionary should be made as accessible as possible on-line to all staff in
Police.
Free-text format fields should only be used in the statistics subset as a last resort.
Where they are included, comprehensive instructions must be provided, specifying how
to use these fields, and coding schemes must be created to encode data in them for the
purpose of producing statistical information.
Note: The contents of excluded fields may be reported as qualitative information, but
excluded fields may not be used to determine which records to aggregate when
calculating statistics. (E.g. they are not to be used in 'Condition' statements of SQL
queries.)
Design of classifications
Statistics New Zealand's 'Best Practice for Classifications' should be referred to and
complied with when designing categorical variables in forms and IT applications.
Note: 'classification' itself requires definition. Some specific requirements include:
• For categorical variables, categories must be mutually exclusive and exhaustive.
• For each categorical variable, a flat or hierarchical classification structure must be
used. "A flat structure should be used when a simple listing is required or when there
is no requirement to aggregate or group categories into more meaningful categories."
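A rough validation sketch for a flat classification follows, approximating the requirements above: exclusivity is checked by forbidding duplicate labels, and exhaustiveness is approximated by requiring a residual category. The residual label "Other" is an assumption for illustration, not a Police or Statistics New Zealand standard.

```python
def check_flat_classification(categories, residual="Other"):
    """Return a list of problems found; empty list means the checks pass."""
    problems = []
    if len(set(categories)) != len(categories):
        problems.append("duplicate categories (not mutually exclusive)")
    if residual not in categories:
        problems.append("no residual category (may not be exhaustive)")
    return problems

print(check_flat_classification(["Burglary", "Assault", "Other"]))  # []
```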
Statistical balance should be considered when deciding the level of aggregation in
establishing category boundaries. Although it is accepted that certain infrequently
occurring types of instances may require a unique category for operational purposes,
unless such granularity is required, aggregation in setting category boundaries should
seek to make the frequency counts in each category of a similar order.
The level of precision that values or categories can take should reflect the level of
precision appropriate for operational purposes in the context where the data is recorded.
For example, it is inappropriate to record the specific level of aggravation in a violent
offence when a member of the public is reporting an assault over the phone; conversely,
when charging an alleged offender, a specific charge is required.
When modifying or designing new classifications, the design should attempt to be robust
against future needs. Statistical feasibility is part of this consideration. Whereas a survey
classification must attempt to fit all responses, for administrative data we need to ensure
all operational scenarios can be fitted into the classification.
Mandatory entry
Data entry should not be prompted with default values in fields, as this introduces both
an error component and statistical bias.
Allow an 'Unknown' category wherever not knowing is a valid operational scenario.
Avoid force-fitting unknowns, as doing so would introduce an error component.
The IT front-end should force valid and mandatory entry of all fields which may be used
in producing statistical information.
Reliability of measurement
If a variable or category cannot be measured with adequate reliability, even if such
information is desired, it should be excluded from the statistics subset.
Rationale: Avoids misinformation.
Definitions are required for 'reliability' and 'adequate'.
This may be difficult to assess.
Modifications to existing designs
Modifications to existing forms or IT applications may impact the continuity of time-series
or the reliability of existing variables. Guidelines should be given in the redesign to
account for this. For example: impact analysis, mapping, metadata creation,
classification version numbers, documentation of form and system changes, etc.
One option is to create a new version number to reflect real-world change or where
there is a change to structure or content. For example, a new version could be created when
new categories are required, old categories deleted, or where the statistical unit being
classified changes.
Also, where a definition changes, this needs addressing in metadata; for example, a
change in what Police treat as being 'rural', even though the form doesn't change.
Ideally the data dictionary should include a database that identifies which variables are
contained in which forms and IT application screens. Modifications to forms, screens or
variables should check this record to maintain alignment.
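The version-numbering option above can be sketched as follows: any structural change to the category set produces a new version, logged so that metadata can flag potential time-series impacts. The class and example categories are invented for illustration.

```python
class Classification:
    def __init__(self, name, categories):
        self.name = name
        self.categories = list(categories)
        self.version = 1
        self.history = []          # (version, reason) pairs for metadata

    def revise(self, categories, reason):
        """Bump the version only when the category set actually changes."""
        if set(categories) != set(self.categories):
            self.version += 1
            self.history.append((self.version, reason))
            self.categories = list(categories)

c = Classification("occurrence_type", ["Burglary", "Assault", "Other"])
c.revise(["Burglary", "Assault", "Robbery", "Other"], "new category: Robbery")
print(c.version)  # 2
```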
Prevention of work-arounds
It is acknowledged that administrative data is used for different purposes by different
users throughout an organisation. Such users often seek to 'innovate' expedient
solutions to report statistics, without giving due consideration to 'quality' issues.
Although there is no single method of preventing this, a combination of steps should be
taken, if an organisation wishes to ensure its statistical information has adequate quality.
These steps should attempt to make it easy for staff to do the right thing, make it hard for
staff to break the rules, and include consequences for breaking the rules. Specific steps
an organisation can take include:
• Producing guidelines
• Producing policy
• Including compliance in audit and performance management frameworks
• Ensuring back-end IT applications used to extract and present statistical
information restrict inclusion of non-compliant variables/fields in statistical
reports.
Optional tactical approaches include:
1. Ensure that any back-end applications used to extract and present statistical
information, are designed in such a way that variables outside the statistics
subset cannot be used as statistical attributes. For example, they cannot be
used in SQL Condition statements or as categorical variables in SQL Select
statements.
2. Ensure that any back-end applications used to extract and present statistical
information are designed in such a way that if variables outside the statistics
subset are used to select records to report, they can only be used to report
lists; not measures.
3. Ensure that any back-end applications used to extract and present statistical
information are designed in such a way that if variables outside the statistics
subset are used to select records to report, such reports will not include
algebraic computations. (E.g. summation)
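Tactic 1 above can be sketched as a back-end query builder that refuses to use any variable outside the statistics subset as a selection condition or grouping attribute. Table and column names below are invented, and the subset would in practice come from the data dictionary.

```python
# Hypothetical statistics subset, assumed loaded from the data dictionary.
STATISTICS_SUBSET = {"district", "offence_code", "occurrence_date"}

def build_count_query(table, where_cols, group_by):
    """Build a COUNT query, rejecting variables outside the statistics subset."""
    for col in list(where_cols) + list(group_by):
        if col not in STATISTICS_SUBSET:
            raise ValueError(f"'{col}' is outside the statistics subset and "
                             "cannot be used to select or group records")
    where = " AND ".join(f"{c} = ?" for c in where_cols)
    groups = ", ".join(group_by)
    return (f"SELECT {groups}, COUNT(*) FROM {table} "
            f"WHERE {where} GROUP BY {groups}")

print(build_count_query("occurrences", ["district"], ["offence_code"]))
```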
Appendix 2: Recommendations for the design-rules
Determine the scope of the change
When only a part of a form or IT system is being modified, a decision is required on
whether to limit the application of these design-rules to just that part or whether to take
the opportunity of reviewing the whole form, IT application, or relevant module of the IT
application.
This decision should be made on a case-by-case basis, applying the principles
articulated in section 2.1 of Appendix 1. The policy should recognise that judgment
needs to be applied. However, it should always require consideration of the principles
articulated in section 2.1 of Appendix 1 and, where appropriate, provide documentation
(E.g. metadata, training materials, etc.).
Define the variables/fields to incorporate, with the aid of a Data Dictionary
Check all of the variables determined to be in-scope in section 3.1 above against the
organisation's data dictionary.
If a proposed variable is the same or similar to an existing variable in the data dictionary,
where such similarity is considered adequate for operational purposes, the existing
variable should be used in preference to creating a new one.
In general, use a structured format in preference to free text, wherever possible.
Wherever the gatekeeper proposes a change from the proposal, this should be worked
through with the proposer and, in turn, stakeholders identified on the template.
Variables on different systems with the same label must be fully compliant with the
specification in the data dictionary, including definitions, categories, etc.
Where no suitable variable already exists in the data dictionary, consideration should be
given to external standards that Police may desire to be compatible with. For example,
we may wish to compare crime statistics with Australia.
Data dictionaries for the Australian Standard Offence Classification (ASOC) and the
Recorded Crime Victims Statistics (RCVS) should be considered. Where no such explicit
standard applies, the Justice Sector data dictionary should be consulted and, failing
that, Statistics New Zealand, in order to identify a potentially suitable existing variable.
Where none exists, a new variable should be created and entered into the data
dictionary, with all required information about it completed. This new variable must have
a unique label - one that is not already in use elsewhere. Subject to this limitation, the
label should be as intuitive as possible.
Free-text format fields should only be used in the statistics subset as a last resort.
Where they are included, comprehensive instructions must be provided, specifying how
to use these fields, and coding schemes must be created to encode data in them for the
purpose of producing statistical information.