# Using Gaia DR2 to Constrain Local Dark Matter Density and Thin Dark Disk


Prepared for submission to JCAP

arXiv:1808.05603v1 [astro-ph.GA] 16 Aug 2018

Jatan Buch (a, 1), Shing Chau (John) Leung (a), and JiJi Fan (a)

(a) Department of Physics, Brown University, Providence, RI, 02912, USA

(1) ORCID: https://orcid.org/0000-0001-6672-6750

E-mail: jatan_buch@brown.edu, shing_chau_leung@brown.edu, jiji_fan@brown.edu

Abstract. We use stellar kinematics from the latest Gaia data release (DR2) to measure the local dark matter density $\rho_{\rm DM}$ in a heliocentric cylinder of radius $R = 150$ pc and half-height $z = 200$ pc. We also explore the prospect of using our analysis to estimate the DM density in local substructure by setting constraints on the surface density and scale height of a thin dark disk aligned with the baryonic disk and formed due to dark matter self-interaction. Performing the statistical analysis within a Bayesian framework for three types of tracers, we obtain $\rho_{\rm DM} = 0.023 \pm 0.012\ M_\odot/{\rm pc}^3$ for A stars; early G stars give a similar result, while F stars yield a significantly higher value. For a thin dark disk, A stars set the strongest constraint: excluding surface densities (5-15) $M_\odot/{\rm pc}^2$ for scale heights below 100 pc with 95% confidence. Comparing our results with those derived using Tycho-Gaia Astrometric Solution (TGAS) data, we find that the uncertainty in our measurements of the local DM content is dominated by systematic errors that arise from assumptions of our kinematic analysis in the low $z$ region. Furthermore, there will only be a marginal reduction in these uncertainties with more data in the Gaia era. We comment on the robustness of our method and discuss potential improvements for future work.

Contents

- 1 Introduction
- 2 Data Selection
  - 2.1 Selection Function
  - 2.2 Vertical Number Density Distribution
  - 2.3 Midplane Velocity Distribution
- 3 Poisson-Jeans Theory
- 4 Statistics and Data Analysis
  - 4.1 Basic setup: Priors, likelihood, and uncertainties
  - 4.2 Bayesian Analysis
- 5 Results and Discussion
  - 5.1 Local DM Content Using Gaia DR2
    - 5.1.1 Local DM Density
    - 5.1.2 Constraints on a Thin DD
  - 5.2 Comparison of Constraints between DR2 and TGAS
  - 5.3 Possible Interpretation of Our Measurement of $\rho_{\rm DM}$
- 6 Conclusions and Outlook
- A Color-magnitude Modeling
- B Uncertainty Analysis
- C Variation of Midplane Cut
- D Bootstrap Statistics
- E Frequentist Analysis

## 1 Introduction

The second release of data collected by the European Space Agency’s Gaia telescope provides the positions and proper motions, with unprecedented precision, of more than one billion sources in the Milky Way (MW) [1–24]. With the release of line-of-sight velocities for about seven million stars, DR2 also allows, for the first time, a dynamical analysis with a self-consistent measurement of the 6D phase space for a stellar population.

DR2 presents an exciting opportunity to use the vertical velocity and number density distributions of different populations of stars that trace the gravitational potential for precisely determining the total matter density, including baryons and dark matter (DM), in the local solar neighborhood. Significant progress has been made in modeling the local baryon budget (interstellar gas, stars, stellar remnants) and its uncertainties [25–28] since Oort’s early estimate [29] of the baryon density. Meanwhile, kinematic methods for estimating the local DM density rely on constraining the total matter content using motions of tracers after assuming a model for the baryons and attributing any additional density, within uncertainty, to DM. These methods are based on: a) the Jeans analysis, which reduces the collisionless Boltzmann equation for the phase space distribution function to a set of moment equations by integrating over all velocities, and b) the Poisson equation, which uses the total matter density in all components to calculate the gravitational potential.

In this work, we primarily focus on the 1D distribution function method developed by Refs. [30–34] and used by Refs. [35, 36] to constrain the local DM density with data from the Hipparcos satellite [37]. However, the approximations of isothermality and decoupling of radial and vertical motions in this method are only valid up to scale height $z \sim 1$ kpc. Therefore, for using tracer data at high $z$, Refs. [38, 39] adopt the more general moment-based method to estimate the DM density. A non-parametric formulation of the moment-based method, described by Ref. [40] and implemented in Ref. [41], uses SDSS/SEGUE G stars in a heliocentric cylinder with $R \sim 1$ kpc and 0.5 kpc

their differences with results using TGAS in Section 5.2, and discuss their robustness in the context of our method in Section 5.3. We conclude and comment on future directions in Section 6.

Figure 1: Flowchart of our analysis. Box labels: selection function; color and volume cuts; negative parallax and midplane latitude cuts; radial velocity in DR2 (No/Yes); average radial velocity; effective completeness; velocity distribution; density distribution from data; predicted density using PJ solver; MCMC sampling of the posterior.

## 2 Data Selection

Gaia DR2 contains ∼1.7 billion stars, among which ∼1.3 billion stars have a five-parameter astrometric solution, $(\alpha, \delta, \mu_{\tilde{\alpha}}, \mu_\delta, \varpi)$, representing the positions and proper motions along right ascension and declination, and the parallax, respectively. We emphasize that for DR2, the parallaxes and proper motions are based solely on Gaia measurements, unlike DR1, which depends on the Tycho-2 Catalogue. DR2 also provides photometric data for a majority of its sources in three passbands, $G$, $G_{\rm BP}$, and $G_{\rm RP}$, with $3 \lesssim G \lesssim 21$. Another new feature in DR2 is the line-of-sight radial velocities, $v_R$, for ∼7.2 million stars brighter than $G_{\rm RVS} = 12$. We refer readers to Refs. [3, 10, 18] for more details of the Gaia DR2 measurements and astrometric solution.

However, Gaia DR2 is still volume incomplete for bright stars with $G < 12$. We use the Two Micron All-Sky Survey (2MASS) catalog [68] to compute the sky completeness of DR2. 2MASS is a full-sky infrared astrometric and photometric survey and is 99% complete in the sky volume of our interest. It provides the angular position of the stars on the celestial sphere, allowing for a cross-match with the Gaia catalog. It also provides the $J$ and $K_s$ magnitudes of each star, which we use to categorize the stars in our data sample.

We query the Gaia archive[^1] for DR2 cross-matched with the 2MASS catalog, requiring that the apparent magnitude $J < 14$, which cuts away stars that are either too dim for the main sequence, or too distant from the Sun.[^2] The resulting cross-matched catalog contains ∼36 million stars. We then use the star counts in the complete 2MASS catalog for $J < 14$ to compute the effective volume completeness of DR2, as we discuss in the following section.

### 2.1 Selection Function

Figure 2: Skymaps showing the number (left) and variance (right) of good AL observations in 3.36 deg² ($N_{\rm side} = 2^5$) HEALPix pixels. Colorbars: number of good AL observations (0 to 50) and SD of good AL observations (0 to 10). The white regions are the “bad” parts of the sky, which do not pass our selection cuts defined in the main text.

In the absence of an official Gaia selection function, we employ the quality cuts used in Ref. [69] to identify the “good” part of the sky in the cross-matched data sample:

(i) The mean number of along-scan (AL) observations ≥ 8.5,

(ii) Spread in the number of AL observations ≤ 10.

[^1]: https://gea.esac.esa.int/archive/

[^2]: In our selected volume, the apparent magnitudes of all tracer stars satisfy $J < 12$.
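The two quality cuts above amount to a boolean mask over sky pixels. The snippet below is a minimal sketch of that mask, assuming the per-pixel mean and spread of the AL observation counts have already been computed; the function name `good_sky_mask` and the toy arrays are hypothetical and are not taken from Ref. [69] or from `gaia_tools`.

```python
import numpy as np

def good_sky_mask(mean_n_al, spread_n_al, min_mean=8.5, max_spread=10.0):
    """Return True for pixels that pass both along-scan (AL) quality cuts."""
    mean_n_al = np.asarray(mean_n_al, dtype=float)
    spread_n_al = np.asarray(spread_n_al, dtype=float)
    # Cut (i): mean number of AL observations >= 8.5
    # Cut (ii): spread in the number of AL observations <= 10
    return (mean_n_al >= min_mean) & (spread_n_al <= max_spread)

# Toy per-pixel statistics (invented numbers, not Gaia data).
toy_mean = np.array([12.0, 8.5, 7.9, 20.0])
toy_spread = np.array([3.0, 10.0, 2.0, 11.5])
mask = good_sky_mask(toy_mean, toy_spread)
good_fraction = float(mask.mean())  # fraction of toy pixels kept
```

Both thresholds are inclusive, matching the "≥" and "≤" in the cuts as stated.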

After these cuts, 95.6% of the sky remains with a mean parallax uncertainty $\lesssim 0.1$ mas. Although DR2 provides an order of magnitude improvement over TGAS in the uncertainties of astrometric parameters within our volume of interest, we still need to include its error budget in our data analysis. As suggested by Ref. [17], we add 0.05 mas to the reported parallax to account for the global offset. Following Ref. [18], we also add in quadrature a systematic uncertainty of ±0.1 mas and ±0.1 mas/yr to the reported values of parallax and proper motions respectively.

We focus on the solar neighborhood and select tracer stars in a heliocentric cylinder with radius $R = 150$ pc (we discuss this choice in more detail in Section 2.2) and half-height $z = 200$ pc. An important factor in selecting ‘good’ tracers[^3] of the local galactic potential is their sensitivity to disequilibria. In particular, as concluded by Ref. [70], disequilibria could have disparate effects on different tracer subpopulations, resulting in incompatible $\rho_{\rm DM}$ measurements. While there are conflicting views in the literature about which stars, old [35, 71] or young [41], are in dynamical equilibrium,[^4] we follow Ref. [41] in choosing younger A (A0-A9), F (F0-F9), and early G (G0-G3) dwarf stars (simply stars henceforth) in our analysis, which have lower scale heights and consequently shorter equilibration timescales, instead of older stars.[^5] We use the color cuts introduced in Ref. [69] to define tracer populations based on different spectral subtypes (indicated in parentheses above). Other important criteria in the choice of tracers for our analysis are sufficient statistics and a reasonable change in the number densities within $|z| \leq 200$ pc.

Since Gaia DR2 is volume incomplete, the stellar density profile needs to be normalized appropriately.
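The astrometric error budget described earlier in this section (a 0.05 mas parallax offset plus systematic terms of 0.1 mas and 0.1 mas/yr added in quadrature) can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the inverse-parallax distance in the last line is an assumption made here for the example.

```python
import numpy as np

PARALLAX_OFFSET_MAS = 0.05  # global offset added to the reported parallax
PARALLAX_SYS_MAS = 0.1      # systematic parallax uncertainty
PM_SYS_MAS_YR = 0.1         # systematic proper-motion uncertainty

def correct_astrometry(parallax, parallax_err, pm_err):
    """Apply the parallax offset and add systematic errors in quadrature."""
    plx = np.asarray(parallax, dtype=float) + PARALLAX_OFFSET_MAS
    plx_err = np.hypot(parallax_err, PARALLAX_SYS_MAS)
    pm_err_tot = np.hypot(pm_err, PM_SYS_MAS_YR)
    return plx, plx_err, pm_err_tot

# Toy star: reported parallax 10 mas with 0.04 mas and 0.07 mas/yr errors.
plx, plx_err, pm_err = correct_astrometry([10.0], [0.04], [0.07])
distance_pc = 1000.0 / plx  # assumption: simple inverse-parallax distance
```

For a star at roughly 100 pc the systematic term dominates the quoted statistical error, which is why it cannot be dropped even though DR2 formal errors are small.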
The color-magnitude dependent normalization, referred to as the effective volume completeness, is the ratio of DR2 number counts to those of the volume complete 2MASS catalog in our region of interest. We follow the method in Ref. [69] developed for the TGAS catalog and use the gaia_tools package[^6] to compute this quantity for DR2. In deriving the effective volume completeness, we also need to compute the selection function, which is the fraction of stars at a given $(J, J - K_s, \alpha, \delta)$ in the DR2 catalog. In the gaia_tools package, the selection function is obtained by a spline interpolation of a modified magnitude in three $J - K_s$ color bins. We vary the modeling of the selection function by increasing the number of color bins and find that the change in the resulting selection function and effective completeness is negligible. A more detailed analysis is presented in Appendix A. The effective completeness as a function of height $z$ for our DR2 tracer populations is plotted in Fig. 3. We also include the effective completeness for the TGAS data as a reference, and note that the DR2 sample is significantly more complete.

Figure 3: Effective volume completeness of each stellar type (A, F, and early G stars) as a function of $z$ [pc] for $R = 150$ pc. The completeness of DR2 (solid) is improved by a factor of ∼3 compared to that of TGAS (dashed) for A, F and early G stars.

### 2.2 Vertical Number Density Distribution

There are 4544 A, 38431 F, and 44075 early G stars in the solar neighborhood defined by our heliocentric cylinder. The volume complete vertical number density for each tracer, shown in Fig. 4, is obtained by dividing the number counts by the effective volume completeness in each $z$ bin. We choose 20 pc as the bin size based on parallax uncertainties, as discussed in Appendix B. Varying the bin size doesn’t significantly affect the results of our analysis. We also present a comparison of star counts in the full volume and in the midplane (defined to be the region with $|b| < 5^\circ$ in the cylinder) between DR2 and TGAS in Table 1. There is roughly a factor of 2 increase in the number of stars in both the full volume and midplane region for each tracer, leading to a ∼30% reduction in the statistical uncertainty due to Poisson error.

| Type | Subtype | DR2 Total | DR2 Midplane | TGAS Total | TGAS Midplane |
|---|---|---|---|---|---|
| A | A0-A9 | 4544 | 310 | 1729 | 182 |
| F | F0-F9 | 38431 | 2213 | 16789 | 1308 |
| Early G | G0-G3 | 44075 | 2166 | 18653 | 1205 |

Table 1: Star counts in DR2 and TGAS catalogs for the heliocentric cylinder and the midplane region ($|b| < 5^\circ$) inside it.

The uncertainty in the star number $N_k$ in the $k$-th $z$ bin is obtained by adding in quadrature the statistical uncertainty $\sqrt{N_k}$ and a 3% systematic uncertainty due to dust extinction. We expect the dust extinction to be important in the visible spectrum, such as the B and V colors used in the Hipparcos catalog, or the $G_{\rm BP}$ and $G_{\rm RP}$ used in DR2. However, colors in the infrared spectrum, i.e. the $J$ and $K_s$ colors used in our cross-matched DR2-2MASS catalog, are associated with longer wavelengths and therefore less affected by galactic dust. Ref. [69] finds that the effect of dust reddening on the number density of stars in the solar neighborhood defined using $J$ and $K_s$ is $\lesssim 3\%$ and mostly affects the overall normalization.

[^3]: Sec. 3.6 of Ref. [45] gives a thorough, even if slightly outdated with the release of DR2, overview of important characteristics for the choices of tracers.

[^4]: It will take a detailed study using N-body simulations to answer this question definitively, which is beyond the scope of this paper.

[^5]: The velocity dispersion of stars increases with age due to scattering with structures in the MW. See Chapter 8.4 of Ref. [72] for a detailed discussion.

[^6]: https://github.com/jobovy/gaia_tools
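The completeness correction and the per-bin error budget described in this section can be illustrated with a short sketch. It assumes equal-volume $z$ bins, so the common bin volume drops out of ratios such as $\nu/\nu_0$, and it ignores any uncertainty on the completeness itself; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def number_density_with_errors(counts, completeness, dust_frac=0.03):
    """Completeness-corrected counts per z bin and the error on ln(nu)."""
    counts = np.asarray(counts, dtype=float)
    completeness = np.asarray(completeness, dtype=float)
    # Divide raw counts by the effective volume completeness in each bin.
    nu = counts / completeness
    # Poisson error sqrt(N_k) and a 3% dust systematic, added in quadrature.
    sigma_counts = np.sqrt(counts + (dust_frac * counts) ** 2)
    # Fractional error on N_k equals the error on ln(nu_k).
    sigma_ln_nu = sigma_counts / counts
    return nu, sigma_ln_nu

# Toy bins (invented numbers, not the DR2 sample).
toy_counts = np.array([400.0, 100.0])
toy_completeness = np.array([0.8, 0.5])
nu, sigma_ln_nu = number_density_with_errors(toy_counts, toy_completeness)

# Doubling the sample shrinks the Poisson error by 1 - 1/sqrt(2), about 29%.
poisson_gain = 1.0 - 1.0 / np.sqrt(2.0)
```

The last line reproduces the arithmetic behind the quoted ∼30% reduction in statistical uncertainty for a factor of 2 increase in star counts.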

Figure 4: Vertical number density profiles, $\ln(\nu/\nu_0)$ versus $z$ [pc], in 20 pc $z$ bins for A (blue), F (green), and early G (orange) stars.

We notice that increasing the cylinder radius $R$ from 150 pc to 200 pc results in an overall broadening of the tracers’ density distribution. This is similar to the broadening reported by Ref. [67] in the TGAS data. A broader density distribution could potentially lead to a much stronger constraint on the local DM content, since additional matter tends to pinch the density distribution. Ref. [67] attributed the broadening to the so-called “Eddington” bias: higher parallax uncertainties of distant stars could lead to a smearing of the density distribution at large $|z|$. While this could be true for the TGAS catalog, the parallax uncertainties are significantly reduced in DR2 and remain small at large $|z|$: the average and 1σ variation of the parallax uncertainty are below 10 pc (still smaller than the 20 pc bin size) at $z = 200$ pc in DR2, even when $R$ is increased to 250 pc, as shown in Fig. 13. Thus, it seems unlikely that the broadening of the density distribution is due to the “Eddington” bias.

### 2.3 Midplane Velocity Distribution

The last ingredient we need from the data is the vertical velocity distribution in the midplane, i.e. at $z = 0$. The vertical velocity of a star is given by

$$w = w_\odot + \frac{\kappa \mu_b}{\varpi} \cos b + v_R \sin b, \tag{2.1}$$

where $w_\odot$ is the Sun’s vertical velocity that we determine by fitting a Gaussian distribution to the data, $\kappa = 4.74\ {\rm km\ yr\ s^{-1}}$ is a unit conversion constant, $\mu_b$ is the proper motion along the galactic latitude $b$ in mas/yr, $\varpi$ is the parallax in mas, and $v_R$ is the radial velocity in km/s.

There are two options for defining the ‘midplane region’:[^7] imposing a cut on the height $|z|$, or on the galactic latitude $|b|$. Since, until the release of DR2, radial velocities have been available only for a subset of tracers, previous analyses chose a region with $|b| \ll 1$ (in radians). With that choice, substituting $v_R$ by its mean value,

$$\langle v_R \rangle = -u_\odot \cos l \cos b - v_\odot \sin l \cos b - w_\odot \sin b, \tag{2.2}$$

where $u_\odot = 11.1 \pm 0.7_{\rm stat} \pm 1.0_{\rm sys}$ km/s and $v_\odot = 12.24 \pm 0.47_{\rm stat} \pm 2.0_{\rm sys}$ km/s [73], only has a subdominant contribution to $w$ since $\sin b \ll 1$.

We explore the possibility of using the $z$-cut [74] in Appendix C by including the newly measured radial velocities in DR2. Unfortunately, DR2 only contains radial velocities for approximately 2% of A stars, 53% of F stars, and 62% of early G stars for $|z| < 20$ pc. We check that the percentage of tracers with radial velocity doesn’t change significantly for higher values of $z$. In that case, only including stars with $v_R$ available could potentially introduce a selection bias, while approximating $v_R$ by its mean value might result in large errors at higher $b$ (even at low $z$). Thus, defining the midplane region using a $z$-cut isn’t viable currently, but that could change with future data releases.

We follow Ref. [67] in choosing $|b| < 5^\circ$ as our midplane cut. After imposing an additional cut to remove stars with negative parallaxes, we are left with 310, 2213 and 2166 A, F and early G stars respectively. The mean of the best fit Gaussian distributions to the midplane vertical velocity, weighted by the star counts of each tracer population in the midplane, is $w = 6.9 \pm 0.2$ km/s. We take this to be the Sun’s vertical velocity $w_\odot$ and note that it is consistent within 1σ with the value in Ref. [73]. Subtracting $w_\odot$ from the stars’ vertical velocity, we find the distributions are roughly symmetric about $w = 0$. The resultant normalized midplane vertical velocity distribution $f_0(w)$ with a $w$-bin size of 1.5 km/s (see Appendix B for more details about this choice) is plotted in the left panel of Fig. 5.

[^7]: At larger $b$ and consequently larger $z$, the kinematically hotter stars broaden the distribution [35]. Meanwhile, simply choosing stars with $z = 0$ yields poor statistics.
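As an illustration of Eqs. (2.1) and (2.2), the sketch below computes a star's vertical velocity and falls back to the mean radial velocity when no measured $v_R$ is supplied. The function names are hypothetical, the solar velocities are the central values quoted from Ref. [73], and the $v_\odot$ term is written with the standard $\sin l \cos b$ projection of the solar motion onto the line of sight.

```python
import numpy as np

KAPPA = 4.74                 # km yr / s, unit conversion constant
U_SUN, V_SUN = 11.1, 12.24   # km/s, central values from Ref. [73]

def mean_radial_velocity(l, b, w_sun):
    """Mean line-of-sight velocity from the solar reflex motion (l, b in rad)."""
    return (-U_SUN * np.cos(l) * np.cos(b)
            - V_SUN * np.sin(l) * np.cos(b)
            - w_sun * np.sin(b))

def vertical_velocity(mu_b, parallax, b, w_sun, v_r=None, l=None):
    """Eq. (2.1): mu_b in mas/yr, parallax in mas, velocities in km/s."""
    if v_r is None:
        # No measured radial velocity: substitute the mean value instead.
        v_r = mean_radial_velocity(l, b, w_sun)
    return w_sun + KAPPA * mu_b / parallax * np.cos(b) + v_r * np.sin(b)

# Toy star exactly in the plane (b = 0), 100 pc away, mu_b = 2 mas/yr.
w_star = vertical_velocity(mu_b=2.0, parallax=10.0, b=0.0, w_sun=6.9, l=0.0)
```

At $b = 0$ the radial-velocity term vanishes identically, which is the reason a small-$|b|$ cut makes the analysis insensitive to missing $v_R$ measurements.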
We consider the asymmetry between the star counts in $-|w|$ and $+|w|$ bins to be the systematic uncertainty, which may be due to non-equilibrium effects. We illustrate the magnitude of this uncertainty in the right panel of Fig. 5 by adding it in quadrature with the statistical error for every $w$ bin. In practice, however, we propagate these errors into the uncertainty of the predicted density, as we elaborate in Sec. 4.1.

We also check the isothermality of the tracers by fitting the midplane data with Gaussian distributions. From the fits, we find that the velocity dispersions $\sigma_z$ are 6.1, 10.4, 16.6 km/s for A, F and G stars respectively. The $\chi^2$ values of the fits are 14.2, 38.6 and 30.0 for 14, 21 and 30 degrees of freedom respectively. The Gaussian (isothermal) distributions give reasonable fits for A and G stars, but not as good a fit for F stars. In the rest of our analysis, we always use the distributions from data and never their Gaussian fits.

Figure 5: Midplane velocity distributions $f_0(w)$ of A, F, and early G stars after subtracting $w_\odot$ (left). The best-fit Gaussian distribution to $f_0(|w|)$ with error bars that include contributions from the statistical uncertainty due to Poisson error and the asymmetry in $-|w|$ and $+|w|$ bins (right). Horizontal axes: $w$ [km/s].

## 3 Poisson-Jeans Theory

The phase space distribution function of a self-gravitating stellar population follows the collisionless Boltzmann equation. Assuming the population is in equilibrium, we integrate the Boltzmann equation over velocity to obtain a set of moment equations, also called the Jeans equations [72]. Using cylindrical coordinates $(r, \phi, z)$ and focusing on the Jeans equation in the $z$ direction,

$$\frac{1}{r\nu_i}\frac{\partial}{\partial r}\left(r \nu_i \sigma_{rz;i}\right) + \frac{1}{r\nu_i}\frac{\partial}{\partial \phi}\left(\nu_i \sigma_{\phi z;i}\right) + \frac{1}{\nu_i}\frac{d}{dz}\left(\nu_i \sigma_{z;i}^2\right) = -\frac{d\Phi}{dz}, \tag{3.1}$$

where $\nu_i$ is the stellar number density of the $i$-th species, $\sigma_{rz}$ ($\sigma_{\phi z}$) are the off-diagonal entries in the velocity dispersion tensor that couple radial (axial) and vertical motions, $\sigma_z$ is the vertical velocity dispersion (the diagonal $zz$ component of the velocity dispersion tensor) and $\Phi$ is the gravitational potential. The first term, usually referred to as the “tilt” term, is negligible for small $z$: for instance, in the case of G stars, $\sigma_{rz} < 20\ {\rm km^2/s^2}$ for $|z| \lesssim 200$ pc [75]. The second term, the so-called “axial” term, is also negligible since our volume of interest is assumed to be (approximately) axisymmetric. In our analysis, we only keep the third term on the left hand side of the Jeans equation, leading to a simple solution for the $i$-th population,

$$\nu_i(z) = \nu_i(0)\, e^{-\Phi(z)/\sigma_{z;i}^2}, \tag{3.2}$$

where we assume that each population is well thermalized near the galactic plane and thus take $\sigma_{z;i}$ to be a constant. If all constituents of a population have the same mass, then the mass density $\rho_i$ is proportional to the number density $\nu_i$ and satisfies

$$\rho_i(z) = \rho_i(0)\, e^{-\Phi(z)/\sigma_{z;i}^2}. \tag{3.3}$$

The gravitational potential is determined by the mass density of the local neighborhood through the Poisson equation,

$$\nabla^2 \Phi = \frac{\partial^2 \Phi}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r \frac{\partial \Phi}{\partial r}\right) = 4\pi G \rho_{\rm tot}(z). \tag{3.4}$$

We treat the effective contribution from the radial term, $\frac{1}{r}\frac{\partial}{\partial r}\left(r \frac{\partial \Phi}{\partial r}\right)$, as a constant mass density[^8] with a value $(3.4 \pm 0.6) \times 10^{-3}\ M_\odot/{\rm pc}^3$ determined from the TGAS data [76].

[^8]: For an axisymmetric system, the radial term can be related to Oort’s constants. Strictly speaking, Oort’s constants and consequently the radial term also depend on $z$. However, since our tracers only explore a small volume close to the midplane, the variation is smaller than the measurement uncertainty [39].
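For readers who want the intermediate step between Eq. (3.1) and Eq. (3.2), the following short derivation keeps only the third term of the Jeans equation and treats $\sigma_{z;i}$ as constant. The normalization $\Phi(0) = 0$ is an assumption made here; it is the convention implied by the form of Eq. (3.2).

```latex
\begin{align}
\frac{1}{\nu_i}\frac{d}{dz}\left(\nu_i \sigma_{z;i}^2\right) &= -\frac{d\Phi}{dz} \\
\sigma_{z;i}^2 \, \frac{d \ln \nu_i}{dz} &= -\frac{d\Phi}{dz}
  \qquad (\sigma_{z;i}\ \text{constant}) \\
\ln \frac{\nu_i(z)}{\nu_i(0)} &= -\frac{\Phi(z) - \Phi(0)}{\sigma_{z;i}^2} \\
\nu_i(z) &= \nu_i(0)\, e^{-\Phi(z)/\sigma_{z;i}^2}
  \qquad (\Phi(0) = 0)
\end{align}
```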

The total mass density, $\rho_{\rm tot}$, contains contributions from $N_b$ baryon components, DM in the halo, and other gravitational sources such as a thin DD. The mass density for the baryons is given by the Bahcall model that consists of a set of isothermal components for gas, stars, and star remnants [77–79]. Each isothermal component is characterized by the midplane density, $\rho(0)$, and the vertical velocity dispersion, $\sigma_z$, as shown in Table 2. We adapt this table from Ref. [67], who, in turn, compiled it from the results of Ref. [27] and supplemented it with velocity dispersions from Refs. [25, 28].[^9] The baryon mass densities as a function of $z$ can be constructed in a straightforward manner using Eq. (3.3). We approximate the density of halo DM in the disk, $\rho_{\rm DM}$, to be constant. As shown by Eq. (28) in Ref. [39], the DM density at or below 200 pc is equal to that in the midplane up to a 2% correction.

| Baryonic components | $\rho(0)$ [$M_\odot/{\rm pc}^3$] | $\sigma_z$ [km/s] |
|---|---|---|
| Molecular gas (H$_2$) | 0.0104 ± 0.00312 | 3.7 ± 0.2 |
| Cold atomic gas (HI (1)) | 0.0277 ± 0.00554 | 7.1 ± 0.5 |
| Warm atomic gas (HI (2)) | 0.0073 ± 0.0007 | 22.1 ± 2.4 |
| Hot ionized gas (HII) | 0.0005 ± 0.00003 | 39.0 ± 4.0 |
| Giant stars | 0.0006 ± 0.00006 | 15.5 ± 1.6 |
| $M_V < 3$ | 0.0018 ± 0.00018 | 7.5 ± 2.0 |
| $3 < M_V < 4$ | 0.0018 ± 0.00018 | 12.0 ± 2.4 |
| $4 < M_V < 5$ | 0.0029 ± 0.00029 | 18.0 ± 1.8 |
| $5 < M_V < 8$ | 0.0072 ± 0.00072 | 18.5 ± 1.9 |
| $M_V > 8$ (M dwarfs) | 0.0216 ± 0.0028 | 18.5 ± 4.0 |
| White dwarfs | 0.0056 ± 0.001 | 20.0 ± 5.0 |
| Brown dwarfs | 0.0015 ± 0.0005 | 20.0 ± 5.0 |

Table 2: Bahcall model for baryons adapted from Ref. [67].

In models with a thin DD, we assume that the DD is isothermal, axisymmetric, and perfectly aligned with the baryonic disk. Following Ref. [80], we choose the parametrization of the thin DD density to be

$$\rho_{\rm DD}(z) = \frac{\Sigma_{\rm DD}}{4 h_{\rm DD}}\, {\rm sech}^2\left(\frac{z}{2 h_{\rm DD}}\right), \tag{3.5}$$

where $\Sigma_{\rm DD}$ is the surface density and $h_{\rm DD}$ is the disk height.
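A quick numerical check makes the normalization of Eq. (3.5) concrete: integrating $\rho_{\rm DD}(z)$ over all $z$ should return $\Sigma_{\rm DD}$, and the midplane density is $\Sigma_{\rm DD}/(4 h_{\rm DD})$. The sketch below uses the Figure 6 example values ($\Sigma_{\rm DD} = 20\ M_\odot/{\rm pc}^2$, $h_{\rm DD} = 10$ pc); the function name and the integration grid are illustrative choices.

```python
import numpy as np

def rho_dark_disk(z, sigma_dd, h_dd):
    """Eq. (3.5): thin dark disk density in Msun/pc^3 for z, h_dd in pc."""
    z = np.asarray(z, dtype=float)
    return sigma_dd / (4.0 * h_dd) / np.cosh(z / (2.0 * h_dd)) ** 2

# Integrate over a range much wider than the disk (trapezoid rule by hand).
z_grid = np.linspace(-500.0, 500.0, 20001)
rho = rho_dark_disk(z_grid, sigma_dd=20.0, h_dd=10.0)
surface_density = np.sum(0.5 * (rho[1:] + rho[:-1]) * np.diff(z_grid))

midplane_density = float(rho_dark_disk(0.0, 20.0, 10.0))  # Sigma / (4 h)
```

For these values the midplane density is $0.5\ M_\odot/{\rm pc}^3$, several times the total baryonic midplane density in Table 2, which is why such a disk pinches the tracer profile so strongly.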
A thin DD aligned with the baryonic disk contributes an additional source of attractive potential, which pulls baryonic matter towards the midplane (see Section 2.2 of Ref. [57] for an example with a toy model). This results in narrower vertical density profiles of tracers, as illustrated in Fig. 6.

For a given mass model characterized by $2N_b$ baryonic parameters, $\rho_{\rm DM}$, $\Sigma_{\rm DD}$ and $h_{\rm DD}$, the total mass density, $\rho_{\rm tot}$, can be written as

$$\rho_{\rm tot}(z) = \sum_{i=1}^{N_b} \rho_i(0)\, e^{-\Phi(z)/\sigma_{z;i}^2} + \rho_{\rm DM} + \rho_{\rm DD}(z). \tag{3.6}$$

Plugging this expression into Eq. (3.4), we solve the resulting second-order differential equation numerically with `scipy.integrate.odeint` to obtain the gravitational potential as a function of $z$. We also explicitly check that our results agree with those of the iterative solver used by Refs. [57, 67].

Figure 6: The predicted number density, $\ln(\nu/\nu_0)$ versus $z$ [pc], of a tracer in a model containing a thin DD with surface density $\Sigma_{\rm DD} = 20\ M_\odot/{\rm pc}^2$ and scale height $h_{\rm DD} = 10$ pc (dashed). For comparison, we also plot the prediction of a model with the same matter content but without the thin DD (solid).

After computing the gravitational potential for a given model, we can combine it with the midplane vertical velocity distribution to predict the number density of tracers. If the $i$-th type of tracer is in equilibrium and its vertical distribution is independent of $R$ and $\phi$, its phase space distribution satisfies the Boltzmann equation in the $z$ direction,

$$w \frac{\partial f_i}{\partial z} - \frac{\partial \Phi}{\partial z}\frac{\partial f_i}{\partial w} = 0, \tag{3.7}$$

whose solution takes the form $f_i(z, w) = F\left(w^2/2 + \Phi(z)\right)$. In addition, if the distribution function is separable in phase space,

$$\int dw\, f_i(z, w) = \nu_i(z) \;\rightarrow\; f_i(z, w) = \nu_i(z) f_{i,z}(w) \quad \text{with} \quad \int dw\, f_{i,z}(w) = 1, \tag{3.8}$$

where $f_{i,z}(w)$ is the vertical velocity distribution function at height $z$. Finally, we integrate the distribution function over velocity to obtain the density distribution [33],

$$\begin{aligned}
\nu_i(z) &= 2\int_0^\infty dw\, f_i(z, w) = 2\int_0^\infty dw\, f_i\left(0, \sqrt{w^2 + 2\Phi(z)}\right) \\
&= 2\nu_i(0)\int_0^\infty dw\, f_{i,z=0}\left(\sqrt{w^2 + 2\Phi(z)}\right) \\
&= 2\nu_i(0)\int_{\sqrt{2\Phi(z)}}^\infty \frac{f_{i,z=0}(|w|)\, w\, dw}{\sqrt{w^2 - 2\Phi(z)}},
\end{aligned} \tag{3.9}$$

where $f_{i,z=0}(|w|)$ is the midplane velocity distribution for the $i$-th tracer determined from data, as shown in the right panel of Fig. 5.

## 4 Statistics and Data Analysis

We analyze the ingredients described in previous sections within a Bayesian framework. For each tracer population, i.e. A, F and G stars, we constrain the local DM content by adding to the baryonic Bahcall model either a) a constant density contribution from the DM halo, $\rho_{\rm DM}$; or b) $\rho_{\rm DM}$ and a thin DD, as defined in Eq. (3.5), parametrized by its surface density, $\Sigma_{\rm DD}$, and scale height, $h_{\rm DD}$. In Section 4.1, we discuss our choices for the prior distribution, and details of the likelihood function and uncertainty analysis, before presenting an overview of the MCMC sampling procedure in Section 4.2.

### 4.1 Basic setup: Priors, likelihood, and uncertainties

Our model $\mathcal{M}$ is characterized by $\theta = \{\psi, \xi\}$, such that $\psi = \{\rho_{\rm DM}, \Sigma_{\rm DD}, h_{\rm DD}\}$ are our parameters of interest while $\xi$ are the nuisance parameters. These include: midplane densities, $\rho_k(0)$, and velocity dispersions, $\sigma_{z;k}$, for each baryonic component in the Bahcall model; overall normalization constants for each stellar population, $N_\nu$; and the height of the Sun above the midplane, $z_\odot$. We assume uniform prior distributions for all parameters except the baryonic ones; their priors are assumed to follow Gaussian distributions,

$$p_b(\zeta|\mathcal{M}) = \prod_{k=1}^{N_b} \frac{1}{\sqrt{2\pi\sigma_{\rho_k}^2}} \exp\left(-\frac{(\rho_k - \bar{\rho}_k)^2}{2\sigma_{\rho_k}^2}\right) \frac{1}{\sqrt{2\pi\sigma_{\sigma_{z;k}}^2}} \exp\left(-\frac{(\sigma_{z;k} - \bar{\sigma}_{z;k})^2}{2\sigma_{\sigma_{z;k}}^2}\right), \tag{4.1}$$

where the mean and variance for each component are taken from Table 2. We summarize the details and ranges of assumed prior distributions for all parameters, $\theta$, used in our analysis in Table 3.

[^9]: We anticipate that future analyses could relax the assumption regarding isothermality of the baryon components and adopt a self-consistent, data-driven approach for modeling the baryon mass density. For example, the mass density for all stellar components could be constructed directly from the Gaia data.
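The Poisson-Jeans step of Section 3 (Eqs. (3.4), (3.6) and (3.9)) can be sketched numerically as follows. This is a simplified illustration rather than the pipeline used in the analysis: it omits the radial-term contribution and the thin DD, uses a single toy baryonic component instead of the Bahcall model, assumes $\Phi(0) = 0$ and $d\Phi/dz(0) = 0$, and all function names are hypothetical. The density integral is evaluated in the non-singular second form of Eq. (3.9).

```python
import numpy as np
from scipy.integrate import odeint

G_NEWTON = 4.30091e-3  # pc (km/s)^2 / Msun

def solve_potential(z, rho0, sigma_z, rho_dm=0.0):
    """Integrate d2Phi/dz2 = 4 pi G rho_tot outward from z = 0.

    z must start at 0; rho0 in Msun/pc^3, sigma_z in km/s, Phi in (km/s)^2.
    """
    rho0 = np.asarray(rho0, dtype=float)
    sigma_z = np.asarray(sigma_z, dtype=float)

    def rhs(y, _z):
        phi, dphi_dz = y
        # Isothermal components respond to the potential; halo DM is constant.
        rho_tot = np.sum(rho0 * np.exp(-phi / sigma_z ** 2)) + rho_dm
        return [dphi_dz, 4.0 * np.pi * G_NEWTON * rho_tot]

    return odeint(rhs, [0.0, 0.0], z)[:, 0]

def predicted_density_ratio(phi, f0, w_max=150.0, n_w=6001):
    """nu(z)/nu(0) = 2 * int_0^inf f0(sqrt(w^2 + 2 Phi)) dw, cf. Eq. (3.9)."""
    w = np.linspace(0.0, w_max, n_w)
    phi = np.atleast_1d(phi)
    integrand = f0(np.sqrt(w[None, :] ** 2 + 2.0 * phi[:, None]))
    trapezoids = 0.5 * (integrand[:, 1:] + integrand[:, :-1]) * np.diff(w)
    return 2.0 * np.sum(trapezoids, axis=1)

def gaussian_f0(w, sigma=6.1):
    """Toy isothermal midplane distribution, normalized over all w."""
    return np.exp(-0.5 * (w / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

z = np.linspace(0.0, 200.0, 201)
phi_single = solve_potential(z, rho0=[0.1], sigma_z=[10.0])
phi_with_dm = solve_potential(z, rho0=[0.1], sigma_z=[10.0], rho_dm=0.02)
nu_ratio = predicted_density_ratio(phi_with_dm, gaussian_f0)
```

Two sanity checks are available for this sketch: a single self-gravitating isothermal component has the analytic solution $\Phi = 2\sigma^2 \ln\cosh(z/z_0)$ with $z_0 = \sigma/\sqrt{2\pi G \rho_0}$, and for a Gaussian $f_0$ the integral in Eq. (3.9) reduces to the isothermal profile of Eq. (3.2). In the analysis itself $f_0$ comes from the binned data rather than a Gaussian fit.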
The predicted number density is constructed by integrating the midplane velocity distribution using Eq. (3.9), and applying Gaussian kernel smoothing to approximate the effect of parallax uncertainties that smear the exact positions of stars. However, since the parallax uncertainties in DR2 are significantly reduced as compared to TGAS, this procedure only has a negligible effect on the predicted density. For each population, the predicted number density is compared to the distribution from the data with a likelihood function

$$p_\nu(d|\mathcal{M}, \theta) = \prod_{i=1}^{N_z} \frac{1}{\sqrt{2\pi\sigma_{\ln \nu_i}^2(\theta)}} \exp\left(-\frac{\left(\ln\left(N_\nu \nu_i^{\rm mod}(\theta)\right) - \ln \nu_i^{\rm data}\right)^2}{2\sigma_{\ln \nu_i}^2(\theta)}\right), \tag{4.2}$$

where $N_z$ is the number of $z$ bins, $\nu_i^{\rm mod}$ is the prediction of a model with parameters $\theta$ and $\nu_i^{\rm data}$ is the volume complete number density constructed from data, as described in Sec. 2.2. We do not multiply the likelihood functions for different stellar populations in our analysis, since doing so assumes all populations are similar and trace the same galactic potential independently. This is a rather simplified assumption which ignores the evolution history of different stellar types. We comment more on this in Section 5.1.1.

| Parameters | Prior type | Range | Total |
|---|---|---|---|
| $\rho_k(0)$, $\sigma_{z;k}$ | Gaussian | Eq. (4.1) | 24 |
| $N_\nu$ | Uniform | [0.9, 2.0] | 3 |
| $z_\odot$ | Uniform | [−30.0, 30.0] pc | 1 |
| $h_{\rm DD}$ | Uniform | [0.0, 100.0] pc | 1 |
| $\rho_{\rm DM}$ | Uniform | [0.0, 0.06] $M_\odot/{\rm pc}^3$ | 1 |
| $\Sigma_{\rm DD}$ | Uniform | [0.0, 30.0] $M_\odot/{\rm pc}^2$ | 1 |

Table 3: Prior distributions of model parameters.

The squared error $\sigma_{\ln \nu_i}^2$ is obtained by adding in quadrature the data and the prediction errors,

$$\sigma_{\ln \nu_i}^2(\theta) = \sigma_{\ln \nu_i^{\rm mod}}^2(\theta) + \sigma_{\ln \nu_i^{\rm data}}^2. \tag{4.3}$$

The data uncertainty $\sigma_{\ln \nu_i^{\rm data}}^2$ is discussed in Sec. 2.2, whereas for a fixed set of $\theta$, the prediction uncertainty $\sigma_{\ln \nu_i^{\rm mod}}^2$ originates from the uncertainties of the velocity profile $f_{z=0}(|w|)$. The uncertainty consists of two sources: a) the statistical uncertainty due to the finite sample size, and b) the systematic uncertainty due to possible non-equilibrium effects, which we characterize by the difference between $f_{z=0}(w > 0)$ and $f_{z=0}(w < 0)$ following the treatment in Ref. [67].

Direct error propagation from uncertainties of $f_{z=0}(|w|)$ to $\sigma_{\ln \nu_i^{\rm mod}}^2$ by derivatives proves to be difficult due to the large number of parameters and their correlations involved. Instead, we estimate the errors by bootstrap resampling. The bootstrap is a technique that extracts statistical estimators, like mean and standard deviation, by repeated random sampling of a data set with replacement. For each stellar type, the raw midplane star data sets are bootstrapped many times to generate many different velocity distributions.
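The resampling step just described can be sketched as below: draw the midplane velocities with replacement, rebuild the normalized $|w|$ histogram each time, and read the spread across realizations. The function name, the toy Gaussian velocities (310 stars with a 6.1 km/s dispersion, mimicking the A-star midplane sample), the number of realizations, and the 0 to 45 km/s range are all illustrative assumptions; only the 1.5 km/s bin width is taken from the text.

```python
import numpy as np

def bootstrap_velocity_histograms(w, bin_edges, n_boot=200, seed=0):
    """Normalized |w| histograms from bootstrap resamples of the velocities."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    hists = np.empty((n_boot, len(bin_edges) - 1))
    for i in range(n_boot):
        # Sample the same number of stars, with replacement.
        resampled = rng.choice(w, size=w.size, replace=True)
        hists[i], _ = np.histogram(np.abs(resampled), bins=bin_edges,
                                   density=True)
    return hists

# Toy midplane sample (synthetic, not Gaia data).
toy_rng = np.random.default_rng(42)
toy_w = toy_rng.normal(0.0, 6.1, size=310)
edges = np.arange(0.0, 46.5, 1.5)   # 1.5 km/s bins from 0 to 45 km/s

hists = bootstrap_velocity_histograms(toy_w, edges)
f0_mean = hists.mean(axis=0)          # bootstrap mean of f0(|w|)
f0_std = hists.std(axis=0, ddof=1)    # per-bin statistical scatter
```

Each row of `hists` is one realization of $f_{z=0}(|w|)$ and can be fed through Eq. (3.9) to obtain one realization of the predicted density.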
For every distribution, we use Eq. (3.9) to derive a predicted density distribution. The statistical uncertainty is extracted from the shape fluctuation in the collection of the predicted density distributions. We approximate the systematic uncertainty due to non-equilibrium effects by computing the difference between predictions based on the distributions of subsets of velocity data with $w > 0$ and $w < 0$. We find that the systematic uncertainties dominate over the statistical ones in the prediction error. More details of the bootstrap procedure can be found in Appendix D.

Our statistical analysis closely follows that of Ref. [67] with one major difference: the treatment of velocity uncertainties. In Ref. [67], the normalization of each velocity bin is also treated as a nuisance parameter, which adds an additional 20-30 parameters to the analysis. In our approach, we propagate the velocity uncertainties, both statistical (estimated using bootstrap resampling) and systematic, into the prediction uncertainties. We check that these two methods yield similar results for TGAS and DR2 data. The sources of uncertainties in our analysis and their corresponding treatment are summarized in Table 4.

| Type | Source | Treatment |
|---|---|---|
| $\nu^{\rm data}$ | Poisson | $\sqrt{N_k}$ in the $k$-th bin |
| $\nu^{\rm data}$ | 3% dust extinction | $0.03 \times \nu^{\rm data}$ |
| $\nu^{\rm data}$ | Gaia systematic uncertainty | ±0.1 mas in $\varpi$; ±0.1 mas/yr in $\mu_{\tilde{\alpha}}$, $\mu_\delta$ |
| $\nu^{\rm mod}$ | statistical errors of $f_{z=0}(\lvert w \rvert)$ | bootstrap resampling |
| $\nu^{\rm mod}$ | $f_{z=0}(w > 0) - f_{z=0}(w < 0)$ | $\lvert \ln \nu^{(+)}(z) - \ln \nu^{(-)}(z) \rvert$ |
| $\nu^{\rm mod}$ | parallax uncertainty | Gaussian kernel smoothing |

Table 4: Uncertainties in our analysis.

### 4.2 Bayesian Analysis

We adopt the Bayesian approach to estimate values of parameters and determine correlations between them. The posterior probability density function (simply the posterior henceforth) of the parameters can be defined using Bayes’ theorem,

$$p(\theta|\mathcal{M}, d) = \frac{p(d|\mathcal{M}, \theta)\, p(\theta|\mathcal{M})}{p(d|\mathcal{M})}, \tag{4.4}$$

where the numerator is given by Eqs. (4.1) and (4.2) and the denominator, referred to in the literature as ‘marginal likelihood’ or ‘evidence’, is defined as

$$p(d|\mathcal{M}) = \int p(d|\mathcal{M}, \theta)\, p(\theta|\mathcal{M})\, d\theta. \tag{4.5}$$

We sample the posterior in Eq. (4.4) with the Markov Chain Monte Carlo (MCMC) sampler emcee.[^10] To draw samples from a $d$-dimensional parameter space, emcee implements the affine-invariant ensemble sampling algorithm of Ref. [81] that is based on simultaneously evolving an ensemble of $N$ walkers. Since each walker in the ensemble independently samples the posterior, emcee is naturally suited for parallel computing on multicore systems (see Ref. [82] for more details). In our implementation, we use 100-300 walkers for 15000-25000 steps depending on the stellar type and components ($\rho_{\rm DM}$ or $\rho_{\rm DM}$ + thin DD) of the local DM content. These numbers are chosen to achieve an acceptance fraction $a_f \approx 0.3$ [83] for each walker. After accounting for the ‘warm-up’ time, ∼4000 steps, of the ensemble, we obtain $\gtrsim 2 \times 10^6$ samples on average for each iteration of our analysis.
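To make Eqs. (4.1) to (4.4) concrete, the sketch below writes the log-likelihood, the Gaussian log-prior, and their sum, the log-posterior up to the constant evidence term. It is a schematic under stated assumptions: the function names are hypothetical, the uniform priors are omitted (within their ranges they only add a constant), and the model prediction `ln_nu_mod` is assumed to be computed elsewhere. A function of this kind is what one would hand to an ensemble sampler such as emcee.

```python
import numpy as np

def log_likelihood(ln_nu_mod, ln_nu_data, sigma_mod, sigma_data, norm=1.0):
    """Log of Eq. (4.2), with the variance combined as in Eq. (4.3)."""
    var = (np.asarray(sigma_mod, dtype=float) ** 2
           + np.asarray(sigma_data, dtype=float) ** 2)
    # ln(N_nu * nu_mod) - ln(nu_data), evaluated per z bin.
    resid = (np.log(norm) + np.asarray(ln_nu_mod, dtype=float)
             - np.asarray(ln_nu_data, dtype=float))
    return float(-0.5 * np.sum(resid ** 2 / var + np.log(2.0 * np.pi * var)))

def log_gaussian_prior(values, means, sigmas):
    """Log of Eq. (4.1) for the baryonic densities and dispersions."""
    values, means, sigmas = (np.asarray(a, dtype=float)
                             for a in (values, means, sigmas))
    chi2 = ((values - means) / sigmas) ** 2
    return float(-0.5 * np.sum(chi2 + np.log(2.0 * np.pi * sigmas ** 2)))

def log_posterior(ln_nu_mod, ln_nu_data, sigma_mod, sigma_data, norm,
                  baryon_values, baryon_means, baryon_sigmas):
    """Log of the numerator of Eq. (4.4); the evidence is a constant."""
    return (log_likelihood(ln_nu_mod, ln_nu_data, sigma_mod, sigma_data, norm)
            + log_gaussian_prior(baryon_values, baryon_means, baryon_sigmas))

# Toy check with two z bins and the first two Table 2 densities.
toy_ll = log_likelihood([0.0, -1.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0])
means = np.array([0.0104, 0.0277])
sigmas = np.array([0.00312, 0.00554])
prior_drop = (log_gaussian_prior(means, means, sigmas)
              - log_gaussian_prior(means + sigmas, means, sigmas))
```

Because $\sigma_{\ln \nu_i}$ depends on $\theta$ through the prediction error, the $\ln(2\pi\sigma^2)$ normalization term is kept rather than dropped as a constant.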
5 Results and Discussion

We discuss the results from the MCMC sampling of the posterior for different local DM components using DR2 data in Section 5.1. Since our statistical analysis closely follows Ref. [67], we cross-validate it by repeating the procedure outlined in Sec. 4.1 with TGAS data in the same galactic volume, and compare the results with those from DR2 in Section 5.2. Although we only compare the results of our respective analyses with a thin DD, the conclusions should also hold for the case with only ordinary DM and no thin DD. We comment on possible interpretations of our result for ρDM, the prospect of discovering a thin DD, and the robustness of our kinematic analysis in Section 5.3.

Footnote 10: http://dfm.io/emcee/current/

5.1 Local DM Content Using Gaia DR2

5.1.1 Local DM Density

Figure 7 (panels: A stars, F stars, early G stars; axes: ρb [M⊙/pc³] versus ρDM [M⊙/pc³]): Marginalized posteriors indicating the degeneracy between the local densities of baryons ρb and halo DM ρDM.

We summarize the results from the posterior sampling for the analysis with baryons and a constant halo DM density ρDM in Table 5. The median values of ρDM obtained through our kinematic analysis of A and early G stars are similar to each other, while using F stars yields a significantly higher value. We also note that our value of ρDM determined using A and early G stars is consistent with previous measurements made using SDSS/SEGUE G star data [84], ρDM = $0.012^{+0.001}_{-0.002}$ M⊙/pc³ (within 1σ) and ρDM = $0.008^{+0.025}_{-0.025}$ M⊙/pc³ (within 2σ), by Refs. [41] and [26] respectively.

| Stellar type | ρDM [M⊙/pc³] | ρDM [GeV/cm³] | ρb [M⊙/pc³] | z⊙ [pc] |
|---|---|---|---|---|
| A stars | $0.023^{+0.010}_{-0.010}$ | $0.874^{+0.380}_{-0.380}$ | $0.089^{+0.007}_{-0.007}$ | $4.95^{+3.78}_{-4.15}$ |
| F stars | $0.047^{+0.006}_{-0.007}$ | $1.786^{+0.228}_{-0.266}$ | $0.091^{+0.007}_{-0.006}$ | $2.52^{+2.58}_{-2.74}$ |
| G stars | $0.021^{+0.014}_{-0.011}$ | $0.798^{+0.532}_{-0.418}$ | $0.090^{+0.007}_{-0.007}$ | $-8.46^{+4.61}_{-4.09}$ |

Table 5: Median posterior values with 1σ errors for the local densities of baryons ρb and halo DM ρDM, and the height of the Sun above the midplane z⊙. The halo DM density ρDM is expressed in both M⊙/pc³ (astronomical units) and GeV/cm³ (particle physics units), where 1 M⊙/pc³ ≈ 38 GeV/cm³.

While the 95% credible regions (CR) for measurements of ρDM with A, F, and early G stars in Fig. 7 overlap and seem consistent with each other at the 2σ level, we emphasize that each tracer population does not necessarily probe the same galactic environment (for instance, in its sensitivity to non-equilibrium features of the MW [70]) due to differences in age and star formation history. Consequently, without appropriate modeling of all prior information in a Bayesian framework, results derived from different tracers should be compared with caution.

5.1.2 Constraints on a Thin DD

We perform a full MCMC scan of the posterior after including a thin DD component along with the local density of halo DM ρDM, and plot the marginalized posteriors for the thin DD parameters, ρDM, and the total midplane baryon density ρb in Figs. 17–19. We find that after marginalizing over the uncertainties of the baryon mass model and asymmetries in the velocity distribution, none of the tracers exclude zero surface density ΣDD for the thin DD at the 1σ level.
Given the exploratory nature of our analysis, this may be interpreted, at best, as an approximate upper bound on the thin DD parameters.

5.2 Comparison of Constraints between DR2 and TGAS

We plot the 95% CR upper limit contours for the thin DD parameters using data from DR2 (TGAS) in the left (right) panel of Fig. 8. Both sets of exclusion curves are significantly stronger than previous results based on the Hipparcos catalog [57]. However, there are obvious differences between our results derived using DR2 and TGAS data (see footnote 11). Using TGAS data, early G stars exclude ΣDD ≳ (5-10) M⊙/pc² depending on hDD, while A stars set the weakest constraint. On the other hand, using DR2 data, A stars exclude ΣDD ≳ (5-15) M⊙/pc², while the weakest constraint is due to F stars. Naively, we would expect a (modest) improvement in the constraints from DR2 data compared to those from TGAS due to increased statistics (a factor of ∼2.5) and decreased parallax uncertainties (due to our choice of binning, these only affect the high z bins). We check numerically that if we take central values from TGAS and uncertainties from DR2 to generate mock distributions for the tracers, the derived constraints on the thin DD are indeed similar to those from TGAS data with minor improvements. Given this expectation, it seems counterintuitive that our DR2 constraints are different from the TGAS ones.

Footnote 11: Our TGAS results roughly agree with Ref. [67], Fig. S12 in particular, although their plot was made using the profile likelihood method while the contours in Fig. 8 have been obtained using a fully Bayesian analysis. We obtained a similar result when we repeated our analysis with the profile likelihood method, which is shown in Appendix E.

Figure 8 (panels: Gaia DR2, TGAS; axes: ΣDD [M⊙/pc²] versus hDD [pc]): 95% CR upper limit contours for surface density ΣDD and scale height hDD of a thin DD for A (blue), F (green), and G (orange) stars using data from DR2 (left panel) and TGAS (right).

Before discussing possible origins of the differences for each tracer population, we note that adding more matter pinches the density profile of tracer stars, such as the effect of the thin DD discussed in Sec. 3. Thus, the narrower the profile from data, or the broader the predicted density, the more matter can be included and the weaker the constraint on the local DM content.

The significant weakening of constraints for F stars stems from small differences in the midplane velocity distributions, as shown in the right panel of Fig. 9. The DR2 velocity distribution is slightly broader. We verify that this trend in the velocity distribution is not an artifact of our choice of the midplane latitude cut or the binning of the velocity data. Although the velocity (and vertical density) profiles from TGAS and DR2 are consistent with each other within uncertainties, the predicted density distribution with DR2 data is broader than that with TGAS data for fixed model parameters (one example is shown in the left panel of Fig. 9). As a result, a higher density in DM components is required to fit the predicted density of F stars to the DR2 number density profile for given baryon parameters.
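The pinching effect noted above, whereby additional matter narrows the tracer density profile, can be illustrated with a deliberately simplified toy model. The sketch below is not Eq. (3.9) and not the mass model used in this paper: it assumes a uniform total midplane density, for which the 1D Poisson equation gives Φ(z) = 2πGρz², and an isothermal tracer with Gaussian vertical velocity dispersion σ_w, for which ln(ν/ν0) = −Φ(z)/σ_w². The function name and the numerical inputs are hypothetical.

```python
import math

G = 4.30091e-3  # gravitational constant in pc (km/s)^2 / Msun


def log_density_ratio(z_pc, rho_mid, sigma_w):
    """ln(nu(z)/nu(0)) for an isothermal tracer in a uniform-density slab.

    Toy assumptions: Phi(z) = 2*pi*G*rho_mid*z^2 (rho_mid in Msun/pc^3) and a
    Gaussian vertical velocity distribution with dispersion sigma_w (km/s).
    """
    phi = 2.0 * math.pi * G * rho_mid * z_pc ** 2
    return -phi / sigma_w ** 2


# Hypothetical inputs: baryons only versus baryons plus extra dark matter.
baryons_only = log_density_ratio(200.0, rho_mid=0.09, sigma_w=10.0)
with_extra_dm = log_density_ratio(200.0, rho_mid=0.11, sigma_w=10.0)
print(baryons_only, with_extra_dm)
```

The second value is more negative than the first, so the profile with more matter falls off faster. Conversely, a broader velocity distribution (larger σ_w) broadens the predicted profile, which leaves room for more matter when fitting a fixed observed profile.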
We also present the volume complete number density profiles and midplane velocity distributions for A and early G stars in Fig. 10 and Fig. 11. From the plots, we note that all the distributions based on TGAS and DR2 data for both these tracers are also consistent within uncertainties, yet there are subtle differences. For the number density profiles, a) there is a narrowing of the DR2 profile at high z due to a reduction in parallax uncertainties for both A and early G stars; b) the DR2 profile of G stars is consistently narrower below the midplane. The velocity distributions using DR2 data are smoother than the TGAS ones, with smaller systematic uncertainties from the asymmetry between negative and positive velocity data. The constraint from early G stars in the DR2 data set gets weaker for two reasons: a slightly narrower density profile and a slightly broader predicted density. However, in the case of A stars, the constraint gets considerably stronger at high hDD due to the reduction in the systematic errors from the asymmetry in the midplane velocity distribution.

Figure 9 (left axes: ln(ν/ν0) versus z [pc]; right axes: f0(|w|) versus w [km/s]; curves: TGAS, Gaia DR2): F stars: (left) volume complete number density profiles overlaid with the predicted density derived using the mean TGAS and DR2 velocity distributions assuming fiducial values for baryons and ρDM = 0.02 M⊙/pc³; (right) midplane velocity distributions with interpolated fits to the data. Note that the TGAS velocity distribution has a bin size of 2 km/s while the DR2 bin size is 1.5 km/s.

Figure 10 (panels: A stars, Early G stars; axes: ln(ν/ν0) versus z [pc]; curves: TGAS, Gaia DR2): Comparison of volume complete number density profiles in TGAS and DR2 data for A (left) and G (right) stars.

We reiterate that Gaia DR2 should be regarded as a different data catalog from TGAS, rather than just a statistical improvement over it [2]. DR1 incorporated positions from the Tycho-2 catalog to generate the five-parameter astrometric solution in the TGAS catalog, whereas the DR2 catalog is independent of any external catalog, with its own self-consistent astrometric solution. Any comparison between the constraints on local DM content from TGAS and DR2 should be made bearing this difference in mind.

Figure 11 (panels: A stars, Early G stars; axes: f0(|w|) versus w [km/s]; curves: TGAS, Gaia DR2): Comparison of midplane velocity distributions in TGAS and DR2 data for A (left) and G (right) stars. Note that the TGAS velocity distribution has a bin size of 2 km/s.

5.3 Possible Interpretation of Our Measurement of ρDM

Our main results from the MCMC sampling of the posterior, e.g. for A stars, imply that the local DM content can accommodate a constant density ρDM = 0.023 ± 0.010 M⊙/pc³, or ρDM = $0.011^{+0.012}_{-0.010}$ M⊙/pc³ and a thin DD with ΣDD = $3.74^{+4.53}_{-2.73}$ M⊙/pc², the precise value depending on hDD. We observe that the 1σ errors are fairly large in both cases and suggest a poor modeling of the systematics in the predicted density, a latent degeneracy between DM and baryons at low z, or, more likely, a combination of both effects. We elaborate upon these ideas in the rest of the section.

An implicit assumption in our modeling of the tracer density profile is that the local neighborhood is axisymmetric and the stellar disk is in dynamic equilibrium. However, growing evidence for disequilibria at |z| ≳ 0.4 kpc (asymmetry in the vertical number counts [85]; vertical waves in the disk at the Sun's position [86–88]; substructure in the velocity distribution of stars in DR2 data [89–91]) warrants a closer look at sources of disequilibria in the solar neighborhood using DR2 data. We defer searches for local disequilibria and the corresponding revision of our traditional kinematic method outlined in Sec. 3 to future work. Presently, we only approximate the effect of non-equilibrium behavior by propagating the asymmetry in the midplane velocity distribution to the error in the predicted density.

The marginalized posterior for each tracer in Fig. 7 indicates a strong degeneracy between measurements of ρb and ρDM. As proposed by Ref. [77], and recently implemented on simulated data by Ref. [74], this degeneracy can only be broken if the kinematic analysis includes the density falloff at larger |z| away from the midplane. Since most of the baryonic matter is confined to the stellar disk with a scale height O(kpc), any excess matter that causes the falloff can be attributed (at least to leading order) to DM, allowing a more precise measurement of ρDM with smaller error bars. On the other hand, this introduces another layer of complexity, as the tilt term in Eq. (3.3) that couples the radial and vertical motions is no longer negligible at |z| ≳ 0.5 kpc and must be modeled by simultaneously fitting to the σRz data [40, 75].

Meanwhile, the highly diagonal posterior in the ρDM–ΣDD plane, combined with the identically flat posteriors in the ρDM–ρb and ΣDD–ρb planes of Figs. 17–19, implies that introducing a thin DD in our analysis merely shifts some of the DM density from ρDM while increasing its relative error. Thus, to set realistic constraints on, or seek evidence for, DM density in the thin DD (or equivalently some form of extended substructure near the midplane) using our procedure, we would need more physical insight into breaking the degeneracy between DM in the halo and a thin DD. In the language of statistics, this translates to expanding the likelihood function with more data, and using hierarchical modeling to define a general class of models for the local DM content such that our two scenarios (ρDM only, and thin DD + ρDM) emerge as special cases with appropriate model-dependent posterior probabilities.

Moreover, we always assume that the thin DD is perfectly aligned with the baryonic disk. Since there are no numerical simulations for the thin DD model, the validity of the alignment assumption is unknown. Modifying our analysis to account for a tilted disk could yield different constraints.

As the above discussion indicates, our results are dominated by systematic errors stemming from an approximate modeling of non-equilibrium behavior and a strong degeneracy between different matter components near the midplane. We note that these errors, in the context of our method, may not be reduced significantly in future Gaia data releases.

6 Conclusions and Outlook

We apply the 1D distribution function method to Gaia DR2 and use stellar kinematics in the solar neighborhood to constrain the local DM density and the properties of a thin DD aligned with the baryonic disk by performing our analysis within a Bayesian framework. We adopt young A, F, and early G stars as tracers since they have shorter equilibration timescales and consequently are expected not to be strongly affected by disequilibria.
Using A stars gives an estimate of ρDM = 0.023 ± 0.010 M⊙/pc³ and sets the strongest constraint on the thin DD, excluding ΣDD ≳ (5-15) M⊙/pc² depending on the scale height with 95% confidence. While we obtain similar results from early G stars, F stars seem to prefer a much higher value of the local DM content. Even though the distributions derived from DR2 are consistent with those from TGAS data within uncertainties, the allowed DM density and parameters of the DD model are quite different for all tracers. In light of these results, we address the origins of the differences and discuss the robustness of our kinematic analysis.

Our results also suggest that we need a better understanding of the physical origin of the systematic uncertainties, which we include in our analysis to account for the asymmetry in the midplane velocity distributions of tracers. One possibility is that with complete data for radial velocities, we could define the midplane region using the z-cut instead of the b-cut and obtain a more precise determination of the velocity distribution. Another possibility is to take a closer look at local disequilibria and their effects on traditional kinematic methods. Although we do not find any statistically significant evidence for non-equilibrium in the vertical density and velocity distributions in our samples, several analyses based on DR2 seem to suggest various sources of disequilibria at distances larger than the heliocentric cylinder we consider.

In terms of baryon modeling, it could be useful to find a self-consistent, data-driven approach to determine the baryon distributions instead of assuming the isothermal Bahcall model. One way to achieve this would be to construct the mass density for stars directly from the data rather than treating it as an isothermal disk.
For a more precise determination of the local DM density, the Poisson-Jeans analysis could be applied to tracers at heights greater than the scale height of the stellar disk to minimize the latent degeneracy between baryons and DM. However, besides modeling effects of disequilibria, an analysis at larger scale height has to go beyond the 1D method and must include terms that couple the motions of tracers in different directions.

We also see a degeneracy between parameters of ordinary DM and thin DD in the marginalized posteriors obtained through MCMC sampling. To break the degeneracy, we would need to distinguish between their effects on tracers, develop new observables, and model priors that reflect these differences.

Acknowledgments

We thank Ian Dell'Antonio, Eric Kramer, Matt Reece, Ben Safdi and Chih-Liang Wu for useful discussions. JB would like to thank Nicolas Garcia-Trillos and Alexander Fengler for extended conversations on MCMC sampling methods and Bayesian statistics. This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/Gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/Gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. It also makes use of data products from the Two Micron All Sky Survey (2MASS), which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center/California Institute of Technology, funded by the National Aeronautics and Space Administration (NASA) and the National Science Foundation (NSF). The results in this work were computed using the following open-source packages: astropy [92], gala [93], gaia_tools [69], and emcee [82]. JF is supported by the DOE grant DE-SC-0010010 and NASA grant 80NSSC18K1010.

A Color-magnitude Modeling

Figure 12 (two panels; axes: J versus J − Ks; color scale: effective completeness): The effective completeness in color-magnitude space. Left: 3 J − Ks bins. Right: 20 J − Ks bins.
We use the gaia_tools package [69], developed for TGAS, for constructing the selection function and computing the effective completeness. In gaia_tools, the infrared color is divided into three bins in the range −0.05 < J − Ks < 1.05. In each bin, the completeness is an interpolating function of JG, a modified magnitude function that removes the strong color dependence of the TGAS completeness at the faint end J ∼ 12. Since the faint end of DR2 extends well beyond J > 12, we use the J magnitude instead of JG for our computation of the effective completeness. As a consistency check, we also vary the J − Ks color binning between 3 and 20 bins (Fig. 12) and find that the variation of the density profiles is less than 2%. Thus, we conclude that the effect of color-magnitude modeling is negligible.

B Uncertainty Analysis

In this section, we discuss our choices of bin sizes in the vertical height z and velocity w for constructing the number density and midplane velocity distribution respectively.

Figure 13 (panels: R = 150 pc, R = 200 pc, R = 250 pc; axes: σ|z| [pc] versus |z| [pc]; curves: A stars, F stars, Early G stars): 1σ spread in the uncertainty (at leading order) of z as a function of z for different radial cuts.

The uncertainty in z is given by

$$
\delta z^2\ (\mathrm{kpc}^2) = \left(\frac{\sin b}{\varpi^2}\right)^2 \sigma_\varpi^2 + \left(\frac{\cos b}{\varpi}\right)^2 \sigma_b^2 + 2\,\frac{\sin b \cos b}{\varpi^3}\, \sigma_{\varpi b}, \tag{B.1}
$$

which is dominated by the parallax uncertainty due to the extra factor of ϖ (in units of mas ≈ 10⁻⁹) in the first term. We plot the uncertainty in z (at leading order) as a function of z for all tracers in Fig. 13. Although the maximum uncertainty is ≈ 10 pc, we conservatively adopt 20 pc as the bin size to account for the underestimation of the reported uncertainties in DR2 [3]. Similarly, the uncertainty in w is

$$
\left(\frac{\sigma_w}{w}\right)^2 = \left(\frac{\sigma_\varpi}{\varpi}\right)^2 + \left(\frac{\sigma_{\mu_b}}{\mu_b}\right)^2 + \text{subleading terms}, \tag{B.2}
$$

where the omitted terms are suppressed by 10⁻² when |b| < 5°. Around the midplane, σ_µb/µ_b ≲ 0.2, which translates to σ_w ≈ 1.5 km/s. Therefore, we pick 1.5 km/s as the bin size for obtaining the f0(w) profile.
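As a rough numerical illustration of the leading-order propagation in Eq. (B.1), the sketch below evaluates the uncertainty in z = sin b / ϖ for a single star. It assumes uncorrelated parallax and latitude errors, so the covariance term is dropped. The function name `vertical_height_sigma` and the example star (ϖ = 4 mas, b = 45°, σ_ϖ = 0.1 mas) are hypothetical choices and are not taken from the analysis.

```python
import math


def vertical_height_sigma(parallax_mas, b_deg, sigma_parallax_mas, sigma_b_deg):
    """Leading-order 1-sigma uncertainty on z = sin(b) / parallax, in pc.

    With the parallax in mas, the distance 1/parallax is in kpc. The parallax
    and latitude errors are assumed uncorrelated (covariance term dropped).
    """
    b = math.radians(b_deg)
    sigma_b = math.radians(sigma_b_deg)
    term_parallax = (math.sin(b) / parallax_mas ** 2) ** 2 * sigma_parallax_mas ** 2
    term_latitude = (math.cos(b) / parallax_mas) ** 2 * sigma_b ** 2
    return 1000.0 * math.sqrt(term_parallax + term_latitude)  # kpc to pc


# Hypothetical star at 250 pc with a 0.1 mas parallax error and negligible
# positional error; its height is z = 1000 * sin(45 deg) / 4, about 177 pc.
sigma_z = vertical_height_sigma(4.0, 45.0, 0.1, 0.0)
print(round(sigma_z, 2))
```

For this example the result is a few parsecs, the same order as the spread shown in Fig. 13, and it scales linearly with the parallax error when the positional error is negligible.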

C Variation of Midplane Cut

The midplane velocity profile is required in Eq. (3.9) to predict the tracer density for a given mass model. Since Gaia provides radial velocities for only part of the sample, we define the midplane in two ways: one is putting a cut on the galactic latitude, |b| < 5°, while the other is requiring |z| < (20-50) pc [74]. For both samples, we approximate vR by its mean value ⟨vR⟩ in Eq. (2.2) when the star's vR is not measured. However, in the z-cut sample, we discard stars with |b| > 5° that do not have any vR data. The midplane velocity distributions of the z- and b-cut samples are presented in Fig. 14 and agree with each other within 1σ uncertainties. We note that the uncertainties in the midplane velocity data using the z-cut are smaller than those using the b-cut. The uncertainties are dominated by systematics due to differences between f(w > 0) and f(w < 0). It turns out that the z-cut data is more symmetric about z = 0 and thus has smaller uncertainties. In our analysis, we still use the b-cut sample, since there could be a potential selection bias in the z-cut sample, in which we discard a considerable fraction of stars with five-parameter astrometric solutions because we do not know their radial velocities.

Figure 14 (panels: A stars, F stars, G stars; axes: f0(|w|) versus w [km/s]; curves: z-cut, b-cut): Midplane velocity distribution f0(|w|) for A (left), F (middle) and early G (right) stars. The distributions obtained using the |b| < 5° cut (green) and the |z| < 20 pc cut (blue) are consistent within error bars.

D Bootstrap Statistics

Bootstrap resampling is a standard statistical technique to acquire the mean and uncertainty when there is only one data set available and analytic propagation of uncertainty cannot be performed easily. The basic idea of the method is described below.
Suppose we have a set of N stars labelled as $S_N = \{X_1, X_2, \cdots, X_N\}$. Each star $X_k$ is associated with 6-dimensional phase space coordinates denoted by $\theta_k$. In bootstrap resampling, we make random draws with replacement, star by star, from the original set of stars $S_N$. This generates a new data set $\tilde{S}_N$ of the same size N, with each star labeled as $\tilde{X}_k$. Since the draws are with replacement, we expect (many) duplicated coordinate values in the new data set, such as $\tilde{X}_k = \theta_k$ and $\tilde{X}_{k+1} = \theta_k$, for large N. Therefore, $\tilde{S}_N \neq S_N$ in general. We resample the original data set $S_N$ a total of B times, labeling the resampled sets as $\tilde{S}_N^{(1)}, \tilde{S}_N^{(2)}, \ldots, \tilde{S}_N^{(B)}$.
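The resampling step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the mock speeds are Gaussian draws standing in for measured |w| values, the statistic is a normalized histogram with the 1.5 km/s bin size of Appendix B, and the function names, sample size, and number of resamples B = 200 are hypothetical.

```python
import random
import statistics


def normalized_histogram(values, lo, width, n_bins):
    """Fraction of values falling in each of n_bins equal-width bins starting at lo."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - lo) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return [c / len(values) for c in counts]


def bootstrap_histogram(values, lo, width, n_bins, n_resamples, seed=0):
    """Per-bin mean and standard deviation of the histogram over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(values)
    hists = []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=n)  # N draws with replacement
        hists.append(normalized_histogram(resample, lo, width, n_bins))
    means = [statistics.mean(h[k] for h in hists) for k in range(n_bins)]
    stds = [statistics.stdev(h[k] for h in hists) for k in range(n_bins)]
    return means, stds


# Mock data: |w| drawn from a Gaussian with a 10 km/s dispersion (an assumption).
mock_rng = random.Random(42)
speeds = [abs(mock_rng.gauss(0.0, 10.0)) for _ in range(2000)]
means, stds = bootstrap_histogram(speeds, lo=0.0, width=1.5, n_bins=20, n_resamples=200)
print(means[0], stds[0])
```

The per-bin standard deviation across resamples plays the role of the statistical uncertainty, while the per-bin mean stays close to the histogram of the original sample.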
