IDI data dictionary: Ministry of Justice data - June 2018 edition - Stats NZ
IDI data dictionary:
Ministry of Justice data
June 2018 edition
Crown copyright ©
See Copyright and terms of use for our copyright, attribution, and liability statements.
Citation
Stats NZ (2018). IDI data dictionary: Ministry of Justice data (June 2018 edition). Retrieved from www.stats.govt.nz.
ISSN 2538-0141 (online)
Published in June 2018 by
Stats NZ Tatauranga Aotearoa
Wellington, New Zealand
Contact
Stats NZ Information Centre: info@stats.govt.nz
Phone toll-free 0508 525 525
Phone international +64 4 931 4600
www.stats.govt.nz
IDI data dictionary: Ministry of Justice data (June 2018 edition)
Contents
Purpose of this data dictionary
Background
Reference documents
About the Ministry of Justice data
Changes made to data in the latest refresh
Coverage
Methodology
Quality information
Privacy, security, or confidentiality issues
Dataset description
Summary table
Dataset variables
Helpful information for analysis
What is the justice sector?
About how many people should I find in justice datasets?
What should you be particularly careful with?
Glossary
Purpose of this data dictionary
IDI data dictionary: Ministry of Justice data (June 2018 edition) documents the content of the
dataset that the Ministry of Justice (MOJ) provides to Stats NZ for use in the Integrated Data
Infrastructure (IDI).
This dictionary provides detail on the variables contained in the dataset, including technical
information and descriptions.
Use this data dictionary if you are interested in accessing Ministry of Justice data in the IDI.
Background
The Ministry of Justice dataset contains records of all charges processed by the criminal courts
since 1992. Each charge includes the date of the offence, when the charge was filed and when it
was resolved.
Reference documents
You may wish to check that numbers you get for initial exploratory analyses are broadly consistent
with the interactive tables available for justice statistics on NZ.Stat. Note that NZ.Stat has separate
tables for adults (17 years and over) and children and young people (aged 10–16). As this data
counts the most serious offence per person per year it is most suitable for overall comparisons,
rather than analysis or comparison of individual offence types.
Additionally, data tables are published on the Ministry of Justice website:
https://justice.govt.nz/justice-sector-policy/research-data/justice-statistics/data-tables/. The data
tables are the best source of information on specific offences or charge outcomes. They also
contain information on all people/identities charged. Between late March and late September, the
data tables contain calendar year data (eg to December 2017). Between late September and late
March, the data tables contain financial year data (eg to June 2018). Note that while the number of
charges each year will be the same in the data tables and in the IDI dataset, the number of
people/identities will differ (as described in Coverage: People/identities).
About the Ministry of Justice data
Changes made to data in the latest refresh
Incorrectly labelled variables
In the March 2018 refresh the moj_chg_charge_outcome_calyear_nbr and
moj_chg_charge_outcome_finyear_nbr variables were mislabelled (calendar values are
2008/2009 rather than 2008, and vice versa). This will be corrected in the June 2018 refresh.
New variables added to the charges dataset
• moj_chg_sex_code - provides gender information, including whether the record is for
an organisation, which is otherwise only available for records linked to the IDI spine
• moj_chg_charge_outcome_cyear_nbr and moj_chg_charge_outcome_fyear_nbr -
replace moj_chg_charge_outcome_year_nbr so that charge outcome year is calculated
for calendar and financial years
• moj_chg_serious_cyear_rank_nbr and moj_chg_serious_fyear_rank_nbr - replace
moj_chg_serious_year_rank_nbr so that rank_all is calculated for calendar and financial
years
• moj_chg_remand_bail_code provides information on whether the person had been
remanded on bail (including EM bail)
• moj__chg_remand_custody_code provides information on whether the person had
been remanded in custody
• moj_chg_suppressed_charge_code provides an indicator of whether the charge had
suppressed information, which would explain why the identity could not be linked to
the IDI spine.
New identities added
The names associated with approximately 400,000 charges between 1992 and 2004, which
were not migrated from the LES case management system into CMS, have been provided
to Stats NZ. This has improved the linking of charges, despite some of the information
being poor quality (eg missing date of birth or first name).
Updated metadata tables
• offence_code - this is updated every refresh as new offences are added or other
updates are made. Note that there are some differences between the Ministry of Justice
(MOJ) and Police offence code tables, particularly in the ANZSOC (Australian and New
Zealand Standard Offence Classification) mapping. A project is underway to better align
all offence code information. As a result, we have made some improvements and have
changed the mapping of some offences in the MOJ offence code table. The most
significant of these is moving offences related to child pornography to 0322: Child
pornography offences, and the mapping of two cannabis equipment offences which
changed from possession/use to manufacture/supply. Additionally, several new
variables were added, including indicators/categorisations for family violence, driving
under the influence, sexual, cannabis, and methamphetamine offences. The Criminal
Procedure Act case category has also been included.
• charge_outcome - an interim youth outcome (INTACT) has been added, and
capitalisation and abbreviations have been corrected for some outcome_6cat and
outcome_11cat categories (eg 'YC proved' is now 'Youth Court proved' and 'DISMISSED' is
now 'Dismissed').
• gender - new table added coding Female, Male, Organisation, and Unknown.
• court - Police Area has been added, along with geographic sort variables for Justice
service area, Police District, and Police Area.
Coverage
Reference period start: 1992 (defined by charge_outcome_year >= 1992). This 1992 start
point limits our ability to determine the offending history of older people.
Reference period end: as at the March 2018 refresh, data has been provided to the end of
June 2017. The June 2018 refresh will include data to the end of December 2017. The data
is updated six-monthly in the December and June refreshes.
Note that the dataset contains charge outcome year values for both calendar and financial
years. Users should take care when using data from the start or end point of the data, as
seen in Figure 1 and Figure 2.
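As a rough illustration of how the two year variables differ, the sketch below derives a calendar year and a financial year label from a charge outcome date. It assumes the financial year runs from 1 July to 30 June and borrows the 'YYYY/YYYY' label style used in Figure 2. The function names are hypothetical, and the values actually stored in the IDI variables may be represented differently.

```python
from datetime import date


def calendar_year(outcome_date: date) -> int:
    """Calendar year of the charge outcome date."""
    return outcome_date.year


def financial_year(outcome_date: date) -> str:
    """Financial year label, assuming a 1 July to 30 June year (an assumption)."""
    # Dates from July onwards belong to the financial year that starts that July.
    start = outcome_date.year if outcome_date.month >= 7 else outcome_date.year - 1
    return f"{start}/{start + 1}"


# A charge resolved on 30 June 2017 and one resolved the next day share a
# calendar year but fall in different financial years.
example_a = date(2017, 6, 30)
example_b = date(2017, 7, 1)
print(calendar_year(example_a), financial_year(example_a))
print(calendar_year(example_b), financial_year(example_b))
```

Under these assumptions, the partial years at each end of the data (for example January to June 2017) are expected to show lower counts than full years.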
Figure 1: Number of charges and number of identities, by charge outcome calendar year,
1992–June 2017
(Chart: vertical axis 0 to 400,000; horizontal axis charge outcome calendar year, 1992 to Jan - Jun 2017; series: Number of charges, Number of identities.)
Figure 2: Number of charges and number of identities, by charge outcome financial year,
January 1992–2016/17
(Chart: vertical axis 0 to 400,000; horizontal axis charge outcome financial year, Jan - Jun 1992 to 2016/2017; series: Number of charges, Number of identities.)
Geographic coverage: all New Zealand.
Target population: all charges filed in court that have been disposed (finalised; that is,
they have a charge outcome and, where appropriate, a sentence or sentences).
Observed population: all charges filed that have an outcome recorded in the Case
Management System (CMS), or its predecessors.
Analysis unit: charges (see Counting rules, below, for guidance on how to analyse the data
by people/identities or cases).
Counting rules
A criminal charge is the basic analysis unit.
A criminal charge may be filed by Police, Corrections, local authorities, or several other
government agencies. Normally each charge refers to one offence. An individual (person or
organisation) may for instance attend court on one occasion for three charges of burglary
and one of assault (four charges).
This dataset includes all charges that have been disposed, that is, for which an outcome
(and where applicable, a sentence) has been determined by the court. This includes
charges that were dismissed or not proceeded with, and where diversion has been
completed.
The dataset contains charges for both people and organisations. However, only records for
people can be linked to the IDI spine.
The number of charges each year can be checked against published data (such as the data
tables on MOJ website). Counts will be the same in the published data tables and in the IDI
dataset.
Convicted charges
To identify convicted charges, use the charge_outcome metadata table to identify the
appropriate charge outcome type codes (eg where charge_outcome_4cat = '1Convicted').
Note that charges in Youth Court usually result in a Youth Court proved outcome, rather
than a Convicted outcome. To identify these outcomes, use where charge_outcome_6cat
= 'YC proved'. Alternatively, use the youth-specific charge outcome where outcome_youth
= '2Youth Court proved s283 order'. See 'Children and young people: they are usually
handled very differently' for more information about youth.
Also note, proved outcomes (those resulting in a guilty plea or finding of guilt) can be used
for some analysis (such as calculating reoffending). Proved outcomes include convicted,
Youth Court proved, discharge without conviction, adult diversion, and Youth Court
discharge. To identify these outcomes, use where charge_outcome_4cat = '1Convicted' or
'2Other proved'.
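The lookup described above can be sketched as follows. This is only an illustration: the outcome type codes and the '3Not proved' label are invented, and the mapping stands in for the real charge_outcome metadata table. Only the '1Convicted' and '2Other proved' category values come from this dictionary.

```python
# Hypothetical extract of the charge_outcome metadata table:
# outcome type code -> 4-category outcome. The codes and the
# "3Not proved" label are invented for illustration.
OUTCOME_4CAT = {
    "CNV": "1Convicted",
    "YCP": "2Other proved",
    "DWC": "2Other proved",
    "DIS": "3Not proved",
}


def outcome_4cat(charge):
    """Look up the 4-category outcome for one charge record."""
    return OUTCOME_4CAT.get(charge["moj_chg_charge_outcome_type_code"])


def convicted_charges(charges):
    """Keep only charges whose outcome maps to '1Convicted'."""
    return [c for c in charges if outcome_4cat(c) == "1Convicted"]


def proved_charges(charges):
    """Keep charges with a proved outcome (convicted or other proved)."""
    proved = ("1Convicted", "2Other proved")
    return [c for c in charges if outcome_4cat(c) in proved]


charges = [
    {"snz_moj_charge_uid": 1, "moj_chg_charge_outcome_type_code": "CNV"},
    {"snz_moj_charge_uid": 2, "moj_chg_charge_outcome_type_code": "YCP"},
    {"snz_moj_charge_uid": 3, "moj_chg_charge_outcome_type_code": "DIS"},
]
print(len(convicted_charges(charges)), len(proved_charges(charges)))
```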
People/identities
For analysis at MOJ we treat organisations and people charged with offences the same.
However, note that only records for people can be linked to the IDI spine.
To count people/identities charged in a year, count unique snz_uids within the year.
To count people/identities convicted in a year, first restrict to convicted charges and
then count unique snz_uids.
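A minimal sketch of these two counts is shown below, using toy records. The is_convicted flag is a hypothetical field standing in for the result of the charge_outcome lookup, and the function name is an assumption.

```python
from collections import defaultdict


def count_identities_by_year(charges, year_field="moj_chg_charge_outcome_cyear_nbr"):
    """Count distinct snz_uid values within each charge outcome year."""
    ids_by_year = defaultdict(set)
    for charge in charges:
        ids_by_year[charge[year_field]].add(charge["snz_uid"])
    return {year: len(ids) for year, ids in sorted(ids_by_year.items())}


# Toy records; is_convicted is a hypothetical, pre-derived flag.
charges = [
    {"snz_uid": 101, "moj_chg_charge_outcome_cyear_nbr": 2016, "is_convicted": True},
    {"snz_uid": 101, "moj_chg_charge_outcome_cyear_nbr": 2016, "is_convicted": False},
    {"snz_uid": 102, "moj_chg_charge_outcome_cyear_nbr": 2016, "is_convicted": False},
    {"snz_uid": 101, "moj_chg_charge_outcome_cyear_nbr": 2017, "is_convicted": True},
    {"snz_uid": 103, "moj_chg_charge_outcome_cyear_nbr": 2017, "is_convicted": True},
]

# People/identities charged: count over all charges.
charged = count_identities_by_year(charges)

# People/identities convicted: restrict to convicted charges first, then count.
convicted = count_identities_by_year([c for c in charges if c["is_convicted"]])
print(charged, convicted)
```

Passing year_field="moj_chg_charge_outcome_fyear_nbr" would give the equivalent financial year counts.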
When we are looking at offending details related to a person/identity we usually look at
their most serious charge in each year. To identify the most serious charge per
person/identity per year, select the charge with the lowest rank within the financial or
calendar year, that is, the lowest moj_chg_serious_fyear_rank_nbr or
moj_chg_serious_cyear_rank_nbr (eg moj_chg_serious_fyear_rank_nbr = 1).
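One way to make this selection is sketched below, assuming each charge is a dictionary keyed by the IDI variable names. The function name is hypothetical. Where two charges share the lowest rank, this sketch simply keeps the first one it encounters.

```python
def most_serious_per_year(
    charges,
    rank_field="moj_chg_serious_fyear_rank_nbr",
    year_field="moj_chg_charge_outcome_fyear_nbr",
):
    """Return the lowest-ranked charge for each (snz_uid, year) pair."""
    best = {}
    for charge in charges:
        key = (charge["snz_uid"], charge[year_field])
        # Keep the charge with the lowest rank; ties keep the first seen.
        if key not in best or charge[rank_field] < best[key][rank_field]:
            best[key] = charge
    return best


# Toy records for illustration only.
charges = [
    {"snz_uid": 101, "moj_chg_charge_outcome_fyear_nbr": 2017,
     "moj_chg_serious_fyear_rank_nbr": 2, "snz_moj_charge_uid": 11},
    {"snz_uid": 101, "moj_chg_charge_outcome_fyear_nbr": 2017,
     "moj_chg_serious_fyear_rank_nbr": 1, "snz_moj_charge_uid": 12},
    {"snz_uid": 102, "moj_chg_charge_outcome_fyear_nbr": 2017,
     "moj_chg_serious_fyear_rank_nbr": 1, "snz_moj_charge_uid": 13},
    {"snz_uid": 101, "moj_chg_charge_outcome_fyear_nbr": 2016,
     "moj_chg_serious_fyear_rank_nbr": 1, "snz_moj_charge_uid": 14},
]
most_serious = most_serious_per_year(charges)
print(most_serious[(101, 2017)]["snz_moj_charge_uid"])
```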
As noted in the Dataset variables, moj_chg_serious_cyear_rank_nbr and
moj_chg_serious_fyear_rank_nbr are calculated at MOJ prior to transfer to Stats NZ. Due to
the linking process, several MOJ identities can be combined into a single snz_jus_uid and
snz_uid. This means that an identity could then have duplicate
moj_chg_serious_cyear_rank_nbr or moj_chg_serious_fyear_rank_nbr values. If two or more
charges have the same rank value, the most serious can be determined based on the
following sort order (note that not all variables are included in the IDI dataset):
proc sort data=rank_all out=rank_all_sort_cyr;
  by charge_outcome_cyr master_prn
     outcome_rank
     sent_rank1 descending swgt1
     sent_rank2 descending swgt2
     sent_rank3 descending swgt3
     sent_rank4 descending swgt4
     sent_rank5 descending swgt5
     /* descending non_parole_years descending non_parole_months */
     cust_rem bail
     /* charge_laid_rank descending blood_breath_alcohol */
     descending seriousness_score
     descending max_years descending max_months descending max_days
     descending max_fine
     imprisonable anzsoc_code offence_code crn;
run;
Where charge_outcome_cyr = moj_chg_charge_outcome_cyear_nbr; master_prn = snz_jus_uid;
outcome_rank = outcome_rank in the charge_outcome metadata table; sent_rank1 = sent_rank in the
sentence metadata table associated with moj_chg_serious_sentence1_code; swgt1 =
moj_chg_sentence1_weight_nbr; seriousness_score, max_years, max_months, max_days, max_fine,
imprisonable, and anzsoc_code are in the offence_code metadata table; offence_code =
moj_chg_offence_code; crn = snz_moj_charge_uid.
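For users working outside SAS, the same ordering can be approximated with a composite sort key, as in the Python sketch below. It mirrors the active (uncommented) keys of the proc sort, negating numeric fields that are sorted in descending order. It assumes every field is populated and that descending fields are numeric. The helper make_charge and its default values are hypothetical and exist only to build toy records.

```python
def seriousness_sort_key(c):
    """Composite key mirroring the active BY variables of the proc sort."""
    key = [c["charge_outcome_cyr"], c["master_prn"], c["outcome_rank"]]
    for i in range(1, 6):
        # Sentence rank ascending, sentence weight descending.
        key += [c[f"sent_rank{i}"], -c[f"swgt{i}"]]
    key += [
        c["cust_rem"], c["bail"],
        -c["seriousness_score"],
        -c["max_years"], -c["max_months"], -c["max_days"],
        -c["max_fine"],
        c["imprisonable"], c["anzsoc_code"], c["offence_code"], c["crn"],
    ]
    return tuple(key)


def make_charge(**overrides):
    """Build a toy charge record; all default values are invented."""
    charge = {
        "charge_outcome_cyr": 2016, "master_prn": "A1", "outcome_rank": 1,
        "cust_rem": 0, "bail": 0, "seriousness_score": 0.0,
        "max_years": 0, "max_months": 0, "max_days": 0, "max_fine": 0,
        "imprisonable": 1, "anzsoc_code": "0000", "offence_code": "0000",
        "crn": 0,
    }
    for i in range(1, 6):
        charge[f"sent_rank{i}"] = 99
        charge[f"swgt{i}"] = 0
    charge.update(overrides)
    return charge


charges = [
    make_charge(crn=1, sent_rank1=1, swgt1=30),
    make_charge(crn=2, sent_rank1=1, swgt1=365),
    make_charge(crn=3, outcome_rank=2, sent_rank1=1, swgt1=999),
]
ordered = sorted(charges, key=seriousness_sort_key)
print([c["crn"] for c in ordered])
```

Within each identity and year, the first record after sorting is treated as the most serious. Variables that are not supplied to the IDI would need to be dropped from the key.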
The number of people/identities each year can be checked against published data (such as
the data tables on the MOJ website). However, the number of people/identities in the IDI
dataset will not equal the number of identities each year in tables using MOJ data
extracted externally to the IDI. There are two reasons for this:
• suppression - when suppression is applied to a charge the unique identifier
(master_prn) is replaced. As a result, if a person has suppressed and non-suppressed
charges in the same year they will have two identifiers (one which can link to the IDI
spine and one which cannot). This means that the dataset supplied to the IDI has more
identities in it than the dataset normally used for analysis at MOJ.
• snz_uid - when Stats NZ determines that multiple MOJ identities are the same person
they are joined together into a single snz_uid. This means that the IDI dataset will have
fewer identities in it than the dataset supplied to the IDI, and fewer identities than
the dataset normally used for analysis at MOJ.
The differences in the number of identities can be seen in Figure 3, which shows an
indexed comparison of identifiers: the MOJ master_prn as is used for reporting at MOJ, the
higher number of MOJ master_prns as supplied to the IDI (including the additional
identifiers attached to suppressed charges), and the smaller number of snz_uids in the IDI
dataset following the combining of identities.
Figure 3: Indexed comparison of number of identities using master_prn, master_prn with
suppression applied and snz_uid, 2008 - 2016
(Chart: vertical axis index relative to MOJ master_prn, 0.97 to 1.03; horizontal axis 2008 to 2016; series: MOJ master_prn, MOJ master_prn (including suppressed master_prns), snz_uids.)
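The indexing used in this comparison can be sketched as a simple ratio of each identifier count to the MOJ master_prn count for the same year. The counts below are invented for illustration and are not taken from the dataset. The function name is hypothetical.

```python
def index_to_baseline(counts, baseline):
    """Express yearly counts relative to a baseline series (baseline = 1.00)."""
    return {year: counts[year] / baseline[year] for year in baseline}


# Invented yearly identity counts, for illustration only.
master_prn = {2015: 1000, 2016: 800}
master_prn_with_suppressed = {2015: 1020, 2016: 808}
snz_uids = {2015: 980, 2016: 792}

supplied_index = index_to_baseline(master_prn_with_suppressed, master_prn)
idi_index = index_to_baseline(snz_uids, master_prn)
print(supplied_index, idi_index)
```

An index above 1.00 indicates more identities than the MOJ reporting count, and an index below 1.00 indicates fewer.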
Cases are a more complex measure, as charges may be grouped into a case and can be
regrouped (cases may be split or merged) as the case progresses through the system. MOJ
recommends counting charges or people/identities rather than cases, for most analyses. (If
you think you need to use cases, please contact justiceinfo@justice.govt.nz to discuss.)
Business rules: MOJ has established business rules for extracting many different types of
data, including charge outcome, offence, and sentence types which are not all included in
this data dictionary. We encourage IDI users to contact justiceinfo@justice.govt.nz to
discuss what is available or how to best approach their analysis.
Methodology
Type of data: administrative data captured from the CMS used by the courts.
Data collector: Court staff enter updates into the CMS. Sector Group within MOJ produces
an updated summary dataset on a six-monthly basis for analysis purposes, which covers
charges that were resolved by the end of the previous calendar or financial year.
Mode of data collection: typed by Court staff into the CMS.
Frequency of data collection: near-daily (whenever courts are operating).
Quality information
Editing: Some people use aliases and therefore have multiple identifiers on the system.
MOJ tries to correct this and assign them to a master identifier (master_prn). Person
record number (PRN) is the identifier used across the sector by MOJ, Police, and the
Department of Corrections. The PRN aims to be unique, but duplicates are far from rare.
Some, but not all, of the multiple identities are resolved by the probabilistic linking
used to link this data to the IDI spine.
Missing data: In general, there is relatively little missing data. All records have a master
identifier (master_prn), although for a small number of records this identifier is a driver
licence number rather than a PRN. Name information is missing for a small number of
records. Date of birth is not present for
approximately 90,000 person identities across the dataset which means these records
cannot be linked to the IDI spine. Records for organisations are also not linked to the IDI
spine.
Charges with suppression do not have any linking information (name or date of birth)
provided to Stats NZ (see Linking for further detail on name suppression).
Data quality: The Law Enforcement System (LES; formerly known as the Wanganui
Computer, and used by justice agencies from the late 1970s) was used as the source of this
data until 2003. From 2004, the data has been sourced from MOJ's CMS. Several data codes
changed around this time. Analysis of trends commonly shows a discontinuity around
2003/2004. In most instances, this discontinuity is probably due to the change in computer
system.
From 29 April 2016, courts data was sourced from the new Enterprise Data Warehouse
(EDW), rather than the justice sector data warehouse (ISIS) used since the introduction of
CMS. Changes in data processing may cause small differences if you compare current
output with similar results produced before 29 April 2016.
Linking
Court suppression orders prevent linking of some identities
The Approved Information Sharing Agreement (AISA) for sharing permitted information with
Statistics New Zealand, which covers the sharing of Courts data with the IDI, specifies that
suppressed information will not be shared. For example, name suppression means that a
person's name cannot be published (sharing information with the IDI is a form of
publication). Name suppression can be given for several reasons. The reasons that affect
this dataset include defendants in specific sexual cases (with the aim of protecting the
victim) and defendants connected to the proceedings at the discretion of the court.
To comply with the AISA, MOJ has not provided the link between the suppressed charge
and the person/identity, in instances where a final suppression order exists in relation to
the charge. Only the link for non-suppressed charges has been shared. This means the
charge information is in this dataset, but there will be no link for that person to the
IDI spine.
About 2 percent of all charges are suppressed. However, there is the possibility of
substantial bias with analysis of data linked to some people due to suppression. For
example, there is a high concentration of suppression orders for people charged with
homicide or sexual offences; in 2017, 25 percent of charges for homicide and 26 percent of
charges for sexual offences had suppression (as seen in Figure 4).
Figure 4: Percentage of charges in each ANZSOC division with suppression, 1992–2017
(Chart: vertical axis % of suppressed charges, 0% to 45%; one series per ANZSOC division: 01: Homicide; 02: Acts intended to cause injury; 03: Sexual assault; 04: Dangerous or negligent acts; 05: Abduction, harassment; 06: Robbery, extortion; 07: Unlawful entry/burglary; 08: Theft; 09: Fraud, deception; 10: Illicit drug; 11: Weapons and explosives; 12: Property damage; 13: Public order; 14: Traffic; 15: Offences against justice; 16: Miscellaneous.)
Some charge outcome types also have higher levels of suppression. The outcome types with
the highest proportions of suppression are:
• Acquitted insane (19 percent)
• Stay of proceedings (17 percent)
• Not proceeded with (16 percent)
• Unfit to stand trial (15 percent)
• Acquitted (not guilty) (12 percent)
• Not proceeded with on indictment (11 percent)
• Discharged s347(1) Crimes Act (11 percent).
Use the is_suppressed_charge variable to check whether records do not link to the IDI
spine because of suppression, or some other reason (such as being an organisation or
being a poor-quality match).
Linking of identities to the IDI spine
MOJ identities are linked with other justice sector (Police and Corrections) identities,
where the snz_jus_uid is assigned, before being linked to the IDI spine. Our data includes
records for organisations, but only records for people can be linked to the spine.
In earlier refreshes:
• charges for organisations were not provided to Stats NZ (prior to June 2017 refresh); all
charges for people and organisations are now provided
• while the details of charges between 1992 and 2004 were provided to Stats NZ, the
names of the people/organisations charged during this period were not always
migrated into CMS from the LES case management system. The names associated with
almost all these charges have now been provided to Stats NZ (since the March 2018
refresh).
Users of this data need to be aware that linkage rates do vary over time. Figure 5 shows
that 66 percent of identities charged in 1992/1993 and not charged again since were linked
to the spine. This increased to 97 percent of identities whose most recent charge was in
2016/2017. While the linkage for the earlier years of the dataset (prior to 2004) is still lower,
it has improved substantially compared to previous refreshes with the provision of
additional name information from the LES case management system.
Note: The results in this paper are not official statistics. They have been created for
research purposes from the Integrated Data Infrastructure (IDI), managed by Stats NZ. The
opinions, findings, recommendations, and conclusions expressed in this paper are those of
the author(s), not Stats NZ. Access to the anonymised data used in this study was provided
by Stats NZ under the security and confidentiality provisions of the Statistics Act 1975.
Only people authorised by the Statistics Act 1975 are allowed to see data about a particular
person, household, business, or organisation, and the results in this paper have been
confidentialised to protect these groups from identification and to keep their data safe.
Careful consideration has been given to the privacy, security, and confidentiality issues
associated with using administrative and survey data in the IDI. Further detail can be found
in the Privacy impact assessment for the Integrated Data Infrastructure available from
www.stats.govt.nz.
Figure 5: Percentage of identities linked to the IDI spine (March 2018 refresh), by person
indicator and most recent charge outcome year, 1992/1993 – 2016/2017
(Chart: vertical axis % link rate, 0% to 100%; series: Linked, person; Not linked, not person; Not linked, person; Linked, not person.)
There are several possible reasons for identities not linking to the IDI spine:
• identities that are provided as organisations are not linked to the IDI spine. The number
of identities each year that are recorded in MOJ data as organisations is small, and has
decreased over time (from 1.2 percent in 1992 to 0.3 percent in 2017).
Note that there are several hundred identities that have snz_person_ind = 0 (not a
person) but are linked to the spine (have snz_spine_ind = 1). The person indicator is
used to distinguish between businesses and people. There are only two datasets in the IDI
with records for businesses (MOJ and IR). If a MOJ/justice sector identity has a populated
birth date, gender and name but only links to the IR data it is only considered to be a
person if it is in the IR EMS, IR4 and IR20 tables; otherwise the record is classified as not
a person (despite being linked to the spine and having other person characteristics).
• identities that do not contain all the matching requirements are also not linked to the
IDI spine. There is no attempt to link identities if they are missing date of birth (there are
approximately 90,000 people with no date of birth, especially in the earlier years) and at
least two of gender (includes where gender is X; this was 1 percent in the early 1990s
and has remained low at about 0.2 percent since), last name, or first name. Records
with incomplete information were more frequent in the earlier years of the dataset.
in Figure 5. Note that these
required to be treated as a person for linking.
• identities with demographic information may not link to identities on the IDI spine due
to incorrect or poor-quality linking information, or because the identities only exist in
the MOJ or other justice sector (Police and Corrections) datasets.
The charge link rate has also improved over the period of the data. As seen in Figure 6, for
the March 2018 refresh, 96 percent of charges were linked in 2016/2017, 2 percent were
suppressed (and so could not be linked) and 2 percent did not link. In comparison, in
1992/1993, 90 percent of charges linked, 1 percent were suppressed and 9 percent did not
link.
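The three-way split reported here can be sketched as follows. snz_spine_ind is the spine indicator mentioned earlier in this section. The suppressed field is a hypothetical boolean that would need to be derived from moj_chg_suppressed_charge_code, whose coding is not stated in this section. The records are toy data.

```python
from collections import Counter


def link_status(charge):
    """Classify one charge into the three categories shown in Figure 6."""
    if charge["snz_spine_ind"] == 1:
        return "Linked"
    if charge["suppressed"]:
        return "Not linked, suppressed"
    return "Not linked, not suppressed"


def link_shares(charges):
    """Percentage of charges in each link status category."""
    counts = Counter(link_status(c) for c in charges)
    total = sum(counts.values())
    return {status: 100 * n / total for status, n in counts.items()}


# Toy data: 8 linked charges, 1 suppressed, 1 unlinked for another reason.
charges = (
    [{"snz_spine_ind": 1, "suppressed": False}] * 8
    + [{"snz_spine_ind": 0, "suppressed": True}]
    + [{"snz_spine_ind": 0, "suppressed": False}]
)
shares = link_shares(charges)
print(shares)
```

Grouping the charges by charge outcome year before calling link_shares would give a yearly series comparable in shape to Figure 6.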
Figure 6: Percentage of charges with identities linked to the IDI spine (March 2018
refresh), 1992/1993 – 2016/2017
(Chart: vertical axis % link rate, 0% to 100%; series: Linked; Not linked, not suppressed; Not linked, suppressed.)
Privacy, security, or confidentiality issues
Privacy, security, and confidentiality issues are covered by the IDI Confidentiality Rules,
which can be found in the Microdata output guide.
Following the IDI privacy and confidentiality rules, all personal identifiers in the IDI have
been encrypted.
Dataset description
Contents of dataset: the charges table is the core table with data about charges in court
(CHARGES).
Summary table
IDI variable name                   Primary key  Mandatory  Format      Classification name       Source variable name
snz_uid                                          Y          N
snz_jus_uid                                      Y          N
snz_moj_charge_uid                               Y          N                                     CRN
moj_chg_charge_outcome_type_code                 Y          6A          charge_outcome_type_code  charge_outcome_type_code
moj_chg_offence_code                             Y          5A          offence                   offence_code
moj_chg_prosecuting_agency_id_code               Y          8A          prosecuting_agency        prosecuting_agency_id
moj_chg_plea_type_code                                      1A          plea                      plea_type_code
moj_chg_sex_code                                            1A                                    gender_code
moj_chg_age_nbr                                  Y          N                                     age
moj_chg_charge_outcome_cyear_nbr                 Y          4N                                    charge_outcome_calyear
moj_chg_charge_outcome_fyear_nbr                 Y          4N                                    charge_outcome_finyear
moj_chg_first_court_id_code                                 3A          court_id                  first_court_id
moj_chg_last_court_id_code                                  3A          court_id                  last_court_id
moj_chg_serious_cyear_rank_nbr                   Y          4N                                    rank_all_calyr
moj_chg_serious_fyear_rank_nbr                   Y          4N                                    rank_all_finyr
moj_chg_YC_ind                                   Y          1N                                    YC_indicator
moj_chg_serious_sentence1_code                              8A          sentence                  Sent1
moj_chg_serious_sentence2_code                              8A          sentence                  Sent2
moj_chg_serious_sentence3_code                              8A          sentence                  Sent3
moj_chg_serious_sentence4_code                              8A          sentence                  Sent4
moj_chg_serious_sentence5_code                              8A          sentence                  Sent5
moj_chg_sentence1_weight_nbr                                8N                                    Swgt1
moj_chg_sentence2_weight_nbr                                8N                                    Swgt2
moj_chg_sentence3_weight_nbr                                8N                                    Swgt3
moj_chg_sentence4_weight_nbr                                8N                                    Swgt4
moj_chg_sentence5_weight_nbr                                8N                                    Swgt5
moj_chg_offence_from_date                                   YYYY-MM-DD                            offence_from_date
moj_chg_charge_laid_date                                    YYYY-MM-DD                            charge_laid_date
moj_chg_first_court_hearing_date                            YYYY-MM-DD                            first_court_hearing_date
moj_chg_last_court_hearing_date                             YYYY-MM-DD                            last_court_hearing_date
moj_chg_charge_outcome_date                                 YYYY-MM-DD                            charge_outcome_date
moj_chg_remand_bail_code                                    8A                                    remand_bail
moj__chg_remand_custody_code                                5A                                    remand_custody
moj_chg_suppressed_charge_code                              1A                                    suppressed_charge
moj_chg_master_prn_code                          Y          11A                                   master_prn
Dataset variables
Variable name    Definition    Source agency variable name
snz_uid Global unique identifier created by Stats NZ. There is an snz_uid for each distinct identity in the IDI. This
identifier is changed and reassigned each refresh.
snz_jus_uid Local unique identifier derived
Police, Justice, and Corrections data are linked together to create the snz_jus_uid before this is linked to the
spine. In most, but not all, instances this identifier will remain the same for an identity across refreshes. Where
Stats NZ receives more information during a subsequent refresh that indicates that two or more identities
represent the same identity, the identifier may change.
snz_moj_charge_uid Unique record number for each charge. Stats NZ encrypts this variable. Source agency variable name: CRN.
moj_chg_charge_outcome_type_code Shows the outcome of the charge (eg convicted or acquitted). Source agency variable name: charge_outcome_type_code.
Use the charge_outcome metadata table to categorise charge outcomes. This provides a label/classification
describing each code and links each code to broader categories:
• outcome_name - name/description of outcome
• outcome_4cat - used most often, as it is the simplest categorisation to determine convicted charges, or
those with proved outcomes (by grouping together 1Convicted and 2Other proved)
• outcome_6cat - used when specific outcome types are required (eg Youth Court proved or Discharge
without conviction)
• outcome_11cat - used when specific outcome types are required (eg Dismissed)
• outcome_youth - used when using data relating specifically to children and young people
• outcome_rank - ranks outcome type codes by seriousness; it is used in the calculation of
moj_chg_serious_cyear_rank_nbr and moj_chg_serious_fyear_rank_nbr
• outcome_notes - provides information on when the outcome type was in use, based on changes in
legislation or case management systems (eg prior to the introduction of CMS in 2004, or since the Criminal
Procedure Act in 2013).
moj_chg_offence_code Code for each type of offence under New Zealand law. More detailed than the Australian and New Zealand offence code
Standard Offence Classification (ANZSOC) coding scheme, which is used for most statistical analysis (this
categorises offences into 16 divisions, then subdivisions and groups (the ANZSOC code is the group code)
more information is available from: http://abs.gov.au/ausstats/abs@.nsf/mf/1234.0)).
Use the offence_code metadata table to identify and categorise offences. The offence_code metadata table is
updated with each IDI refresh as new offences are added or other updates are made.
Note that there are some differences between the MOJ and Police offence code tables, particularly in the
ANZSOC mapping. A project is underway to better align all offence code information. As a result, we have made
some improvements and changed the ANZSOC mapping of some offences in the MOJ table. The most
significant of these is moving offences related to child pornography to 0322: Child pornography offences, and
the mapping of two cannabis equipment offences which changed from possession/use to manufacture/supply.
• offencedescription - name of each offence
• offence_code_description - combination of offence code and offence description
• effectivedate - date offence was introduced (if known)
• obsoletedate - date offence became obsolete (as required and if known)
• LegReference - section of legislation related to the offence and penalties
• ANZSOC - ANZSOC group code
• IndividualMaxFine - maximum $ amount of fine that can be imposed
• CorporateMaxFine - maximum $ amount of fine that can be imposed
• MaxDays - maximum prison days that can be imposed
• MaxMonths - maximum prison months that can be imposed
• MaxYears - maximum prison years that can be imposed
• Max_is_range - if maximum sentence is a range then = Y. For these the lowest value is included (eg
methamphetamine/amphetamine offences list the maximum sentence for amphetamine rather than the
higher sentence for methamphetamine)
• SeriousnessScore - way of quantifying the relative seriousness of offences based on the sentences imposed
for each offence. For more information refer to: Justice Sector Seriousness Score (2016 update): FAQs (PDF,
9pp)
• is_imprisonable - indicator of whether the offence is imprisonable or not
• ANZSOC_div_code_name - combination of division code and name
• ANZSOC_subdiv_code_name - combination of subdivision code and name
• ANZSOC_group_code_name - combination of group code and name
• ANZSOC_div_code - 2 digit division code
• ANZSOC_subdiv_code - 3 digit subdivision code
• ANZSOC_div_name - division name
• ANZSOC_div_short_name - shortened version of division name (eg Homicide)
• ANZSOC_subdiv_name - subdivision name
• ANZSOC_group_name - group name
• offence_category_code - the Criminal Procedure Act 2011 categorises offences into four categories of
increasing length of maximum sentences: C1, C2, C3 or C4. See s6 of the Act for more information:
http://www.legislation.govt.nz/act/public/2011/0081/latest/DLM3360039.html. Category 3 (and category 4)
offences are often used operationally as an indicator of serious offences, as these are punishable by
imprisonment of 2 years or more.
• is_sexual_violence_flg - indicates whether the offence is in ANZSOC division 03: Sexual assault and related
offences
• sexual_violence_desc - categorises the sexual offences by the type of victim, based on keywords in the
offence description: Children (under 16 years), Adult females (16 years and over), Adult males (16 years and
over), Unknown age and/or gender. More information can be found in the data table definitions on the MOJ
website: https://justice.govt.nz/justice-sector-policy/research-data/justice-statistics/data-tables/#sexual.
• sexual_violence_sort - provides a sort order for outputting the sexual_violence_desc variable
• is_driving_under_influence_flg - indicates whether the offence is a driving under the influence offence.
These include ANZSOC groups 0132: Driving causing death (which relate to alcohol or drugs), 0411: Driving
under the influence of alcohol or other substance and 1431: Exceed the prescribed content of alcohol or
other substance limit. Driving under the influence offences include driving under the influence of alcohol
and/or drugs (in 2017, approximately 3% of charges were specifically for drug offences, less than 1% were for
alcohol and/or drugs, and 97% were specifically for alcohol offences; up until 2015, 1% of charges were
specifically for drug offences). More information can be found in the data table definitions on the MOJ
website: https://justice.govt.nz/justice-sector-policy/research-data/justice-statistics/data-tables/#driving-under-the-influence.
• is_drug_cannabis - indicates whether the offence is a cannabis offence (in division 10: Illicit drug offences).
More information can be found in the data table definitions on the MOJ website:
https://justice.govt.nz/justice-sector-policy/research-data/justice-statistics/data-tables/#drug-offences.
• is_drug_meth - indicates whether the offence is a methamphetamine offence (in division 10: Illicit drug
offences). Note that methamphetamine offences include methamphetamine and amphetamine. More
information can be found in the data table definitions on the MOJ website:
https://justice.govt.nz/justice-sector-policy/research-data/justice-statistics/data-tables/#drug-offences.
• is_family_violence - indicates whether the offence is a family violence offence. Family
violence offending can be covered by a range of different offences that are not easily identifiable as
involving family violence in MOJ data, or could involve a non-family violence situation. To be able to
produce statistics that are robust over time, MOJ uses Breach of protection order, Common assault
(domestic), and Male assaults female to represent family violence (more than 90% of these charges involve
family members). As such, these do not include offending which is charged under different offence types,
including more serious offences such as homicide. More information can be found in the data table
definitions on the MOJ website:
https://justice.govt.nz/justice-sector-policy/research-data/justice-statistics/data-tables/#family-violence-offences.
• family_violence_desc - categorises the three family violence offences: Breach of protection order, Common
assault (domestic) and Male assaults female.
• offence_typology_desc - categorises offences into the typology used by the project formerly known as the
Investment Approach to Justice (eg High acquisitive etc).
moj_chg_prosecuting_agency_id_code Identifies the agency that filed and prosecuted the charge. Prosecuting agency
Police file most charges (typically around 80% of charges filed each year), but others are filed by the
Department of Corrections (10%), local authorities, Crown Law, the Ministry of Social Development, Inland
Revenue, the Ministry for Primary Industries, MBIE, NZ Customs, Serious Fraud Office, Worksafe NZ, Department
of Conservation and other agencies with enforcement responsibilities.
Use the prosecuting_agency metadata table to identify these agencies, including:
• prosecuting_agency_name
• prosecuting_agency_type (such as Police, Crown, Local Authorities etc).
moj_chg_plea_type_code Shows the final plea made by the defendant in relation to the charge (eg guilty, not guilty, or no plea). plea type code
Use the plea_type metadata table to identify these types, including:
• plea_name
• plea_type (Guilty, Not guilty, Not recorded/No plea).
moj_chg_sex_code Shows the gender of the person or whether the identity is an organisation. gender code
Use the gender_code metadata table to identify these.
moj_chg_age_nbr Age of the person in years at the time of the alleged offence where possible (or at a proxy date, for example,
charge laid date or first court hearing date, where offence date is unavailable). age
Values below 10 are invalid. In some instances where the date of birth and the offence date both exist, the
calculated age is invalid and is populated as missing. This occurs when:
• age has been calculated to be less than 10 years
• calculated ages and criteria for Youth Court eligibility do not correspond.
Because of this, care should be taken if age is recalculated for age at a different reference date (eg charge
outcome date) using date of birth.
Whilst 5- or 10-year age bands are generally used for analysis, modification is required for children and young
people. Generally, categories of 10 to 16 years (or 10 to 13 and 14 to 16 years) and 17 to 19 years are used.
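The age banding described above could be sketched as follows. The use of 5-year bands from age 20 upwards, the band labels, and the function name are assumptions for illustration; the treatment of ages under 10 as missing follows the validity rule stated for this variable.

```python
def age_band(age):
    """Return an analysis age band, or None when age is missing or invalid.

    Assumption: 5-year bands are used from age 20 upwards.
    """
    if age is None or age < 10:
        # Values below 10 are invalid and are treated as missing.
        return None
    if age <= 13:
        return "10-13"
    if age <= 16:
        return "14-16"
    if age <= 19:
        return "17-19"
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"
```

Collapsing "10-13" and "14-16" into a single 10 to 16 band is a one-line change if the child and young person split is not needed.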
moj_chg_charge_outcome_calyear_nbr, moj_chg_charge_outcome_finyear_nbr charge outcome year
NOTE: in the March 2018 refresh the moj_chg_charge_outcome_cyear_nbr and
moj_chg_charge_outcome_fyear_nbr variables have been mislabelled at load time (calendar values show
as 2008/2009 rather than 2008, and vice versa). With this refresh, use
moj_chg_charge_outcome_fyear_nbr to get calendar years, and moj_chg_charge_outcome_cyear_nbr to
obtain financial years.
Shows the year in which the charge was disposed, based on the charge outcome date. There are now year
variables for calendar years and financial years. Note that the dataset begins in January 1992 so 1991/1992 will
be a part year. Also note whether the end point of the dataset is in June or December. Ensure that the
corresponding calendar year or financial year rank variable is used.
This variable is used as the basis for most analysis. Where charge_outcome_date is missing, but an outcome has
been recorded, the year from last_court_hearing_date is used instead. Doing this results in no missing values in
charge_outcome_year (despite missing values in charge_outcome_date).
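The fallback described above (use the last court hearing date when the charge outcome date is missing) can be sketched as below. This is not MOJ's actual derivation: the function names are hypothetical, and the financial year is assumed to run from July to June and to be labelled in the form 2008/2009, which is consistent with January 1992 falling in 1991/1992.

```python
from datetime import date


def _reference_date(charge_outcome_date, last_court_hearing_date):
    # Prefer the outcome date; fall back to the last hearing date.
    return charge_outcome_date or last_court_hearing_date


def outcome_calendar_year(charge_outcome_date, last_court_hearing_date):
    ref = _reference_date(charge_outcome_date, last_court_hearing_date)
    return ref.year if ref else None


def outcome_financial_year(charge_outcome_date, last_court_hearing_date):
    """Financial year label, assuming a July to June year."""
    ref = _reference_date(charge_outcome_date, last_court_hearing_date)
    if ref is None:
        return None
    start = ref.year if ref.month >= 7 else ref.year - 1
    return f"{start}/{start + 1}"
```

For example, an outcome dated January 1992 gives calendar year 1992 and financial year 1991/1992, which is why the first financial year in the dataset is a part year.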
moj_chg_first_court_id_code Identifies the court where the first hearing was held, usually closest to where the crime took place. first court id
Use the court metadata table to identify and categorise courts:
• court_name (eg Palmerston North District Court)
• court_location - includes all court types at each location (eg Palmerston North, includes Palmerston North
Youth Court, District Court, and High Court)
• court_type (eg District Court)
• court_JSA - the Justice Service Area is an administrative grouping of courts (eg Taitokerau,
Manawatu/Wairarapa). There are 16 JSAs. Up until June 2017 courts were grouped into 14 Service delivery
areas (JSAs differ from SDAs by changes to some of their names, Wellington SDA being split into two JSAs
and the addition of a Central Registry JSA). Prior to Service delivery areas slightly different groupings called
court were used.
• court_JSA_sort_order - JSAs are ordered geographically from north to south
• police_district - mapping of court location to Police District (eg Central)
• police_district_sort_order - Police Districts are ordered geographically from north to south
• date_closed - date the court closed (Note that when the Upper Hutt and Lower Hutt courts closed in 2013,
they were combined into Hutt Valley Court). There are some records where the court hearing is recorded at a
court that has closed several years previously.
moj_chg_last_court_id_code Identifies the court where the last hearing was held. last court id
Use the court metadata table to identify and categorise courts.
moj_chg_serious_cyear_rank_nbr, moj_chg_serious_fyear_rank_nbr Ranks the charges, by seriousness, for each identity in a given calendar or financial year (based on charge
outcome date). rank all
Note that the first or last year in the dataset may only be a half year.
A value of 1 indicates this was the most serious charge for that identity in that year. A range of information is
used to determine which charge is the most serious in a year. This includes information such as charge
outcome, sentence type, sentence length/amount, remands in custody and on bail, and maximum offence
penalties. For further information on rank all, see Coverage: most serious charge per person per year.
moj_chg_serious_year_rank_nbr is calculated at MOJ prior to transfer to Stats NZ. This is based on a
PRN (the justice sector unique identifier). However, the linking process can combine several PRN identities into
a single snz_jus_uid, which would mean that an identity could then have duplicate
moj_chg_serious_year_rank_nbr values.
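One way to check for the duplicate ranks described above is to count, for each identity and year, how many charges carry the top rank. The sketch below assumes the top rank is stored as 1 and that each charge row is a dictionary keyed by variable name; the helper name is hypothetical.

```python
from collections import Counter


def identities_with_duplicate_top_rank(charges, year_field, rank_field):
    """Return (snz_jus_uid, year) pairs with more than one top-ranked charge.

    Assumption: a rank of 1 marks the most serious charge in the year.
    """
    counts = Counter(
        (c["snz_jus_uid"], c[year_field])
        for c in charges
        if c[rank_field] == 1
    )
    return sorted(key for key, n in counts.items() if n > 1)


sample = [
    {"snz_jus_uid": 101, "moj_chg_charge_outcome_cyear_nbr": 2016,
     "moj_chg_serious_cyear_rank_nbr": 1},
    {"snz_jus_uid": 101, "moj_chg_charge_outcome_cyear_nbr": 2016,
     "moj_chg_serious_cyear_rank_nbr": 1},
    {"snz_jus_uid": 101, "moj_chg_charge_outcome_cyear_nbr": 2016,
     "moj_chg_serious_cyear_rank_nbr": 2},
    {"snz_jus_uid": 202, "moj_chg_charge_outcome_cyear_nbr": 2016,
     "moj_chg_serious_cyear_rank_nbr": 1},
]

duplicates = identities_with_duplicate_top_rank(
    sample,
    "moj_chg_charge_outcome_cyear_nbr",
    "moj_chg_serious_cyear_rank_nbr",
)
```

How to resolve any duplicates found (for example, which of the tied charges to keep) is an analytical decision that this dictionary does not prescribe.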
moj_chg_YC_indicator Indicator of whether the charge was heard in the Youth Court. See Children and young people: they are usually
handled very differently for more about Youth Court. Youth Court indicator
A charge is determined to have been heard in the Youth Court if the person is aged 10 to 16 years and the charge
outcome court is a Youth Court. This allows charges transferred to the District Court for sentencing (which can
occur for more serious types of offending) to be counted in the Youth Court where the charges were heard.
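The rule above can be expressed as a simple predicate. This is a sketch only: the function name is hypothetical, and it assumes the court type of the outcome court is available as the text "Youth Court" (for example via court_type in the court metadata table).

```python
def heard_in_youth_court(age, outcome_court_type):
    """True when the person is aged 10 to 16 and the outcome court is a Youth Court.

    Assumption: outcome_court_type holds the court_type label of the
    charge outcome court, with Youth Courts labelled "Youth Court".
    """
    if age is None:
        return False
    return 10 <= age <= 16 and outcome_court_type == "Youth Court"
```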
moj_chg_serious_sentence1_code Most serious sentence imposed for that charge. sent1
A convicted charge may have more than one sentence imposed. As such, up to 5 sentences per charge are
included in the dataset. These are ranked in order of seriousness, from moj_chg_serious_sentence1_code to
moj_chg_serious_sentence5_code.
Use the sentence metadata table to identify and categorise these. Any analysis of sentences needs to first limit
the data to CONVICTED charges. This is because charge outcome types other than convicted can receive
sentences and orders. For example, an identity discharged without conviction, although not convicted, may be
ordered to pay reparation, be disqualified from driving, or receive a Sentencing Act Final Protection Order.
Failure to exclude non-convicted charges from sentence analysis will produce potentially misleading results.
The hierarchy used to rank sent1 to sent5 is shown (broadly) by the alphabetical order of sent_10cat (the 10
categories included in the metadata sentence table). These are:
a. imprisonment
b. -2004 data)
c. community detention
d. intensive supervision
e. community work
f. supervision
g. monetary
h. deferment
i. other
j. no sentence recorded.
The sent_10cat is the preferred categorisation for sentence analysis.
An alternative is to use sent_6cat where less differentiation of community or other sentence types is required.
However, n -2004 sentences grouped together
with Home detention in sent_10cat.
1. -2004)
2. home detention
3. other community
4. monetary
5. other
6. no sentence recorded.
There is also a youth-specific categorisation, sent_youth. This orders penalties based on the section 283
Oranga Tamariki Act 1989 hierarchy of responses:
• s283(a)-(b) discharge, admonish
• s283(c)-(j) monetary, confiscation, disqualification
• s283(ja)-(jc) education, rehab programmes
• s283(k)-(l) youth supervision, community work
• s283(m) supervision with activity, intensive supervision
• s283(n) supervision with residence
• s283(o) adult sentences.
A combination of sent_youth and sent_10cat can be used, for example, to detail the types of adult sentence (eg
imprisonment) received.
Care should also be taken when applying the sentence categorisation to the 2nd, 3rd, 4th, and 5th most serious
sentences. The categorisation should not be applied to charges with no values for each of these variables, eg:
case when sent2 = ' ' then ' '
else sentence_10cat
end as sentence2
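The same idea, combined with the earlier advice to limit sentence analysis to convicted charges, might look like the following outside SQL. The sentence codes and the shape of the lookup are invented for illustration; the real codes and categories come from the sentence metadata table.

```python
# Hypothetical sentence codes mapped to sent_10cat categories.
SENTENCE_10CAT = {
    "IMP": "imprisonment",
    "CW": "community work",
    "FINE": "monetary",
}


def categorise_sentences(charge, is_convicted):
    """Return categories for sent1 to sent5, leaving blank slots blank.

    Non-convicted charges return an empty list so that they drop out of
    sentence analysis.
    """
    if not is_convicted:
        return []
    categories = []
    for n in range(1, 6):
        code = (charge.get(f"moj_chg_serious_sentence{n}_code") or "").strip()
        # Only look up a category when the slot actually holds a code.
        categories.append(SENTENCE_10CAT[code] if code else "")
    return categories


example = {
    "moj_chg_serious_sentence1_code": "IMP",
    "moj_chg_serious_sentence2_code": "FINE",
    "moj_chg_serious_sentence3_code": " ",
}
```

Here `categorise_sentences(example, True)` gives a category for the first two sentences and empty strings for the remaining three slots, rather than assigning a category to sentences that do not exist.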
Note the hierarchy of sentence types used by MOJ differs in two ways from the hierarchy of sentence types used
by Corrections:
• MOJ ranks community detention as more serious than intensive supervision.
• MOJ ranks community work as more serious than supervision.
These differences mean analyses of most serious sentence using MOJ data may show higher counts of
community work relative to supervision, and higher counts of community detention relative to intensive
supervision, than parallel analysis using Corrections data. No legislative guidance advises which ranking is
continue rather than either agency needing to change. Changing either sentence hierarchy now would create
inconsistency with previously published output dependent on the hierarchy of sentences.
moj_chg_serious_sentence2_code Identifies the second-most serious sentence for the charge if there was one, and is otherwise blank. sent2
moj_chg_serious_sentence3_code Identifies the third-most serious sentence for the charge if there was one, and is otherwise blank. sent3
moj_chg_serious_sentence4_code Identifies the fourth-most serious sentence for the charge if there was one, and is otherwise blank. sent4
moj_chg_serious_sentence5_code Identifies the fifth-most serious sentence for the charge if there was one, and is otherwise blank. sent5
moj_chg_sentence1_weight_nbr Shows the weight (length/amount) of the sentence in the sent1 variable. swgt1
The unit varies by sentence type:
Sentence type                   Unit
Imprisonment                    days
Home detention                  days
Community detention             days
Intensive supervision           days
Community work                  hours
Supervision                     days
Monetary (fine, reparation)     $
Deferment                       days
Other                           -
No sentence recorded            -
Order and Order for forfeiture). Care should also be taken with the obsolete sentence type Community service
(CS) which has no sentence weight (and so averages should not be calculated for Community work sentences
prior to 2003 when this sentence type was replaced in the Sentencing Act 2002).
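Because the unit differs by sentence type, weights are only comparable within a type. The sketch below averages weights separately for each sentence type and skips sentences with no weight; the function name and the input shape (pairs of sentence type and weight) are assumptions for illustration.

```python
from collections import defaultdict

# Units taken from the table above; "Other" and "No sentence recorded"
# have no unit and are left out of the averages.
UNITS = {
    "Imprisonment": "days",
    "Home detention": "days",
    "Community detention": "days",
    "Intensive supervision": "days",
    "Community work": "hours",
    "Supervision": "days",
    "Monetary": "$",
    "Deferment": "days",
}


def mean_weight_by_type(sentences):
    """Average weight per sentence type, returned with its unit.

    sentences: iterable of (sentence_type, weight) pairs; weight may be
    None when no weight was recorded.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for sentence_type, weight in sentences:
        if sentence_type in UNITS and weight is not None:
            totals[sentence_type][0] += weight
            totals[sentence_type][1] += 1
    return {s: (total / n, UNITS[s]) for s, (total, n) in totals.items()}


averages = mean_weight_by_type([
    ("Imprisonment", 100),
    ("Imprisonment", 300),
    ("Community work", 80),
    ("Community work", None),  # eg an obsolete sentence with no weight
    ("Other", None),
])
```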
For indeterminate sentences, any minimum non-parole period imposed needs to be considered. However, this
information is not currently included in the IDI dataset. Whilst not ideal, in its absence the minimum non-parole
period can be estimated:
• Life imprisonment = 10*365*1.5 (where 10 years is the minimum non-parole period * 365 days * a factor of
1.5)
• Preventive detention = 5*365*1.5 (where 5 years is the minimum non-parole period * 365 days * a factor of
1.5).
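Worked through, the two estimates above come to 5,475 days for life imprisonment and 2,737.5 days for preventive detention. A minimal sketch of the calculation, with hypothetical names:

```python
DAYS_PER_YEAR = 365
FACTOR = 1.5

# Minimum non-parole periods in years, as given above.
MIN_NON_PAROLE_YEARS = {
    "life imprisonment": 10,
    "preventive detention": 5,
}


def estimated_days(sentence_type):
    """Estimated sentence weight in days for an indeterminate sentence."""
    return MIN_NON_PAROLE_YEARS[sentence_type] * DAYS_PER_YEAR * FACTOR
```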
moj_chg_sentence2_weight_nbr Shows the weight of the sentence in the sent2 variable. The unit varies by sentence type (as for sent1). swgt2
moj_chg_sentence3_weight_nbr Shows the weight of the sentence in the sent3 variable. The unit varies by sentence type (as for sent1). swgt3
moj_chg_sentence4_weight_nbr Shows the weight of the sentence in the sent4 variable. The unit varies by sentence type (as for sent1). swgt4
moj_chg_sentence5_weight_nbr Shows the weight of the sentence in the sent5 variable. The unit varies by sentence type (as for sent1). swgt5
moj_chg_offence_from_date Date on which the alleged offence occurred. This may be many years prior to when the charge was filed. offence from date
moj_chg_charge_laid_date Date on which the charge was filed by the prosecuting authority (or closest proxy). charge laid date
moj_chg_first_court_hearing_date Date of the first court hearing associated with the charge. first court hearing date
moj_chg_last_court_hearing_date Date of the last court hearing associated with the charge. last court hearing date
moj_chg_charge_outcome_date Date of the charge outcome (or closest proxy). charge outcome date
moj_chg_remand_bail_code Provides information on whether the person had been remanded on bail (including EM bail) at any point during
the charge. bail
This does not provide information on the total time, or individual bail periods. It also does not
include Police bail, which occurs prior to the first court appearance.
If a charge is not going to be resolved at the first court appearance, the court will have to decide whether to
hold the person in custody, or whether to release them until their next court appearance. Granting the person
bail means the court will release the person on conditions, including that they return to court for their next
required appearance.
The variable contains the following categories (in order of priority if a person was remanded on more than one
type of bail at different points during the charge):
• EM bail - this is electronically monitored bail
• bail - includes on bail, on bail for psychiatric report and bail deferred sentence
• at large - this is release without conditions
• none - where the person has not been remanded on bail, but has had a different remand type such as
remand in custody or an administrative adjournment to a later court date
• missing - where the person has not been remanded on bail or remanded in custody; their charge was most
likely finalised on the day.
For charges finalised at the first court appearance there is no opportunity for bail. A person may have spent
time remanded on bail and time remanded in custody for the same charge.
court.
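To illustrate the priority order of the bail categories, the sketch below reduces several remand types recorded for one charge to a single value. It shows the ordering only and is not a description of how MOJ derives the variable; the function name and the use of None for the "missing" category are assumptions.

```python
# Highest priority first, following the order listed above.
BAIL_PRIORITY = ["EM bail", "bail", "at large", "none"]


def summarise_bail(remand_types):
    """Return the highest-priority bail category seen for a charge.

    remand_types: bail categories recorded at different points during
    the charge. Returns None (ie missing) when nothing was recorded.
    """
    seen = [r for r in remand_types if r in BAIL_PRIORITY]
    if not seen:
        return None
    return min(seen, key=BAIL_PRIORITY.index)
```

For a charge where the person was first on ordinary bail and later on EM bail, `summarise_bail(["bail", "EM bail"])` returns "EM bail". The same pattern would apply to the remand in custody categories with their own priority list.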
moj__chg_remand_custody_code Provides information on whether the person had been remanded in custody at any point during the charge. custrem
This does not provide information on the total time remanded in custody, or individual remand periods.
Corrections data is a better source for detailed remand in custody information.
The variable contains the following categories (in order of priority if a person was remanded on more than one
type of custody at different points during the charge):
• adult - includes in custody, in custody for psychiatric report, in custody s121 Criminal Justice Act 1985
(which was replaced by the Criminal Procedure (Mentally Impaired Persons) Act 2003)
• child - includes CYPFA custody s238(1)(d), Police Custody s238(1)(e), Custody - Other s238(1)(c)
• none - where the person has not been remanded in custody, but has had a different remand type such as
bail or an administrative adjournment to a later court date
• missing - where the person has not been remanded on bail or remanded in custody; their charge was
most likely finalised on the day.
For charges finalised at the first court appearance there is no opportunity for remand in custody. A person may
have spent time remanded on bail and time remanded in custody for the same charge.
moj_chg_suppressed_charge_code Provides an indicator of whether the charge had suppressed information, which would explain why the identity
could not be linked to the IDI spine. suppressed
Refer to Linking for further detail on suppression.
moj_chg_master_prn_code Person identifier which is intended to be unique for any person. PRN
Helpful information for analysis
What is the justice sector?
The justice sector includes the Ministry of Justice, New Zealand Police, the Department of
Corrections, the Crown Law Office, the Serious Fraud Office, and the Ministry for Children,
Oranga Tamariki (formerly Child, Youth and Family). The sector collaborates to reduce
crime and enhance public safety; and to provide access to justice by delivering modern,
effective and affordable services.
Other justice sector datasets in the IDI include:
• Department of Corrections (Sentencing and remand data)
• Oranga Tamariki
• NZ Police (Recorded Crime Offender Statistics; Recorded Crime Victim Statistics).
About how many people should I find in justice datasets?
It often helps to think of the criminal justice sector as a pipeline: offenders typically enter
by dealing with Police, progress through the court, and some end with a Corrections-
managed sentence. Figure 7 shows numbers of identities in various parts of the pipeline in
2016 (with sentences assigned by seriousness hierarchy so a person is counted only once).
Figure 7: Number of distinct identities passing selected points in the criminal justice
pipeline, year to December 2016
[Flow diagram. Values shown in the figure:]
• Offenders proceeded against by Police: 112,000
• Non-court proceedings: 39,000
• Prosecuted by Police: 73,000
• Prosecuted by Corrections: 5,000
• Prosecuted by other agencies: 3,000
• Total prosecuted: 81,000
• Not convicted: 16,000
• Convicted: 65,000
• Convicted & discharged: 3,000
• Sentenced: 62,000
• Prison: 9,000
• Community: 27,000
• Fine imposed: 23,000
• Other: 3,000
• Fine paid: 21,000
• Fine remittal: 2,000
• Community total: 29,000
• Remand: 11,000
• No sentence: 58,000
You can check numbers of charges and numbers of people on NZ.Stat, but note that there
are separate tables for children and young people and for adults 17 years and older.
Much crime is not reported to, or detected by Police, and so is
not recorded in administrative data.
The New Zealand Crime and Safety Survey (NZCASS) was a large nationwide survey of
residents aged 15 years and over. The survey results show that 69 percent of in-scope
offences were not reported to Police, and this is higher still for certain types of offences.
For further detail see the NZ Crime and Safety Survey main findings report.
This means a large proportion of offending is not known to Police, which flows through to
charges in court. Keep this missing data in mind when undertaking analysis.
Implication: an increase in recorded crime of a particular type is not
necessarily showing increased occurrences of such crimes. The public may have been
encouraged to increase reporting of that crime type or Police may have increased effort
placed on detecting/discovering those offences.
Only criminal court data is included in the court charges dataset
Court data supplied to the IDI is about criminal charges, so it excludes Family Court (which
handles some aspects of domestic violence and offending by children) and civil justice
(disputes that are not about breaking criminal law, eg disputes over business contracts or
between neighbours). Note: Section 14(e) of the Oranga Tamariki Act 1989 states that one
reason for defining a child as being in need of care or protection is serious concern for their
well-being due to the number, nature, or magnitude of their offending. Where this concern
exists, a declaration may be sought in the Family Court under s67 of the Act that the child is
in need of care or protection. An overview of different courts is on the Ministry of Justice
website.
What should you be particularly careful with?
Counting rules: Be aware these can have big impacts on the meaning of justice data.
Units: Be aware that very different results can result from different choices of unit.
Children and young people: they are usually handled very differently.
The youth justice system is very different to that for adults (ie aged 17 years or over). So,
the data is very different too. Specifically:
• The age of criminal responsibility in New Zealand is 10 (ie children under 10 years of age
cannot be prosecuted).
• New Zealand has separate justice processes for under 17-year-olds: the child offending
process for 10- to 13-year-olds and the youth justice process for 14- to 16-year-olds.
Both processes have a dual focus on accountability and rehabilitation. Some youth
aged 17 years or older can be dealt with in the youth justice system if they were aged
under 17 years at the time they offended.
• Children (aged 10 to 13 years) cannot be prosecuted except for the offences of murder
and manslaughter, and for 12- and 13-year-olds, also a small number of other serious
offences in certain circumstances (following a legislative change from 1 October 2010).