Author:
Paul Schrimpf
Date: 2019-03-04
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
This document was created using Weave.jl. The code is available on GitHub. The same document generates both static webpages and associated Jupyter notebooks.
\[ \def\indep{\perp\!\!\!\perp} \def\Er{\mathrm{E}} \def\R{\mathbb{R}} \def\En{{\mathbb{E}_n}} \def\Pr{\mathrm{P}} \newcommand{\norm}[1]{\left\Vert {#1} \right\Vert} \newcommand{\abs}[1]{\left\vert {#1} \right\vert} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator*{\argmin}{arg\,min} \def\inprob{\,{\buildrel p \over \rightarrow}\,} \def\indist{\,{\buildrel d \over \rightarrow}\,} \]
```julia
using Pkg
Pkg.activate("..")
Pkg.resolve()
```
```
 Resolving package versions…
  Updating `~/565/assignments/PharmacyEntry/Project.toml`
 [no changes]
  Updating `~/565/assignments/PharmacyEntry/Manifest.toml`
 [no changes]
```
```julia
using Revise
if (!("../src" ∈ LOAD_PATH))
  push!(LOAD_PATH, "../src")
end
using PharmacyEntry
```
I have merged everyone’s province specific parsing code, geocoded the pharmacies, and updated the data in the git repository. Update your git folder, and rerun your notebook from part 1 to create a cleandata.csv with data from all the parsed provinces. When complete, cleandata.csv should have about 222 rows (perhaps fewer if you decided to eliminate population centres that are too large or too close together).
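A quick way to confirm the merge worked is to count the rows after loading. A minimal check, assuming cleandata.csv is in the working directory (as in the simulation code later in this document):

```julia
using DataFrames
import CSV
df = CSV.read("cleandata.csv")
println(nrow(df)) # should be about 222
```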
As in Bresnahan and Reiss (1991), we will assume that the profits per pharmacy in market \(m\) with \(N\) pharmacies is
\[ \begin{align*} \pi_{m,N} = s_m \underbrace{(\alpha_1 + x_m\beta + \sum_{n=2}^N \alpha_n)}_{\text{variable profits}} - \underbrace{\left(\gamma_1 + \delta w_m + \sum_{n=2}^N \gamma_n \right)}_{\text{fixed costs}} + \epsilon_m \end{align*} \]
where \(s_m\) is the size of the market. To simplify, I am omitting the \(\lambda\) and the other size shifting variables from the model. You may add these if you wish.
Let \(\theta = (\alpha, \beta, \gamma)\) denote the model parameters. If we assume \(\epsilon_m\) has cdf \(F_\epsilon()\) (conditional on \(s\), \(x\), and \(w\)), then the likelihood of observing \(N_m\) pharmacies in market \(m\) is
\[ \begin{align*} P(N = N_m | s_m, x_m, w_m; \theta) = & P(\pi_{m,N_m} \geq 0 \;\&\; \pi_{m,N_m+1} < 0) \\ = & P\left(-\left[s_m (\alpha_1 + x_m\beta + \sum_{n=2}^{N_m} \alpha_n) - \left(\gamma_1 + \delta w_m + \sum_{n=2}^{N_m} \gamma_n \right)\right] \leq \epsilon_m \leq -\left[s_m (\alpha_1 + x_m\beta + \sum_{n=2}^{N_m+1} \alpha_n) - \left(\gamma_1 + \delta w_m + \sum_{n=2}^{N_m+1} \gamma_n \right)\right] \right) \\ = & F_\epsilon\left(-\left[s_m (\alpha_1 + x_m\beta + \sum_{n=2}^{N_m+1} \alpha_n) - \left(\gamma_1 + \delta w_m + \sum_{n=2}^{N_m+1} \gamma_n \right)\right]\right) - F_\epsilon\left( -\left[s_m (\alpha_1 + x_m\beta + \sum_{n=2}^{N_m} \alpha_n) - \left(\gamma_1 + \delta w_m + \sum_{n=2}^{N_m} \gamma_n \right)\right] \right) \end{align*} \]
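To make the formula concrete, here is a minimal sketch of this probability for a single market: it is just a difference of two cdf evaluations at the (negative) deterministic profit terms. The name `market_prob` and the scalar inputs `xβ` and `wδ` are illustrative, not part of the assignment code:

```julia
using Distributions

# Probability of observing N firms in one market, following the formula
# above. α and γ have length Nmax; sum(α[1:c]) gives the cumulative
# α₁ + Σ_{n=2}^c α_n term (and similarly for γ). Illustrative only.
function market_prob(N, s, xβ, wδ, α, γ; Fϵ = z->cdf(Normal(), z))
  profit(c) = s*(sum(α[1:c]) + xβ) - (sum(γ[1:c]) + wδ) # π without ϵ
  upper = N < length(α) ? Fϵ(-profit(N+1)) : 1.0 # no (N+1)-th firm possible
  lower = N > 0 ? Fϵ(-profit(N)) : 0.0
  return upper - lower
end
```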
The loglikelihood is then
\[ \mathcal{L}(\theta) = \frac{1}{M} \sum_{m=1}^M \log P(N = N_m | s_m, x_m, w_m; \theta), \]
and \(\theta\) can be estimated by maximizing,
\[ \hat{\theta} = \argmax_\theta \mathcal{L}(\theta). \]
Write a function to compute the loglikelihood. You may do this however you want, but I suggest using the following skeleton code.
```julia
using Distributions, DataFrames

"""
    brentrymodel(data::AbstractDataFrame, n::Symbol, s::Symbol,
                 x::Array{Symbol,1}, w::Array{Symbol,1}; Fϵ)

Create loglikelihood for a Bresnahan & Reiss style entry model.

Inputs:

- `data` DataFrame
- `n` name of number of firms variable in data
- `s` name of market size variable in data
- `x` array of names of variable profit shifters
- `w` array of names of fixed cost shifters
- `Fϵ` cdf of ϵ, optional, defaults to standard normal cdf

The same variables may be included in both `x` and `w`.
"""
function brentrymodel(data::AbstractDataFrame, n::Symbol, s::Symbol,
                      x::Array{Symbol,1}, w::Array{Symbol,1};
                      Fϵ = x->cdf(Normal(),x))
  # skip observations with missings
  vars = unique([n, s, x..., w...])
  inc = completecases(data[vars])
  N = disallowmissing(data[n][inc])
  S = disallowmissing(data[s][inc])
  X = disallowmissing(convert(Matrix, data[x][inc,:]))
  W = disallowmissing(convert(Matrix, data[w][inc,:]))
  Nmax = maximum(N)
  function packparam(α,β,γ,δ)
    θ = [α;β;γ;δ]
  end
  function unpackparam(θ)
    α = θ[1:Nmax]
    β = θ[(Nmax+1):(Nmax+size(X,2))]
    γ = θ[(Nmax+size(X,2)+1):(Nmax+size(X,2)+Nmax)]
    δ = θ[(Nmax+size(X,2)+Nmax+1):end]
    (α,β,γ,δ)
  end
  # While maximizing the likelihood, some parameters might make the
  # likelihood 0 (or very close to 0); taking its log would create
  # problems. Use logfinite from PharmacyEntry.jl instead.
  logf = logfinite(exp(-100.0)) # could adjust the exp(-100.0)
  function loglike(θ)
    (α,β,γ,δ) = unpackparam(θ)
    error("You must write the body of this function")
    # P = array of likelihoods for each observation
    # return(mean(logf.(P)))
  end
  return(loglike=loglike, unpack=unpackparam, pack=packparam)
end
```
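The skeleton uses `logfinite` from PharmacyEntry.jl so that parameter values implying a near-zero likelihood do not produce `-Inf`, which would break the optimizer and forward-mode differentiation. The actual definition is in the package; as a rough sketch of the idea (`logfinite_sketch` is a hypothetical name, not the package's exact code), such a function equals `log` above a cutoff and extends linearly below it:

```julia
# Sketch of a logfinite-style function: log(x) for x ≥ cutoff, and the
# first-order (tangent line) extension of log below the cutoff, so the
# result is finite and differentiable everywhere.
function logfinite_sketch(cutoff)
  logc = log(cutoff)
  dlogc = 1/cutoff
  return x -> (x >= cutoff) ? log(x) : logc + (x - cutoff)*dlogc
end
```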
It is good practice to test any estimation method on simulated data. The function brentrysim
in PharmacyEntry/src/entrymodel.jl
simulates this model. Use it to test your likelihood. Here is some code to simulate. You may need to adjust the parameters to get a decent distribution of number of firms (i.e. not all 0 or 5).
```julia
# Simulating data
using DataFrames, Statistics, StatsBase
import CSV
df = CSV.read("cleandata.csv")

# Important to scale variables to avoid numerical problems in both
# simulation & estimation
df[:pop10k] = df[Symbol("Population, 2016")]./10000
df[:logpop10k] = log.(df[:pop10k])
df[:income10k] = df[Symbol("Average total income in 2015 among recipients (\$)")]./10000
df[:density1k] = df[Symbol("Population density per square kilometre")]./1000
df[:logdensity] = log.(df[:density1k])
df[:logarea] = log.(df[Symbol("Land area in square kilometres")])
df[:mediumage] = df[Symbol("15 to 64 years")]./100

# parameters for simulation
n_obs_sim = 500 # you might want to adjust this. You want it to be
                # large enough that your estimates are close to the
                # true values, but small enough that it doesn't take
                # too long to estimate

# the number of pharmacies in the simulated data will range from
# 0 to length(α)
α = [1.0, -1.]
γ = [1.0, 1.]
# you may have to adjust the parameters to get a reasonable
# distribution of the number of pharmacies across markets
svar = :pop10k
β = [1., 1.]
xvars = [:income10k, :mediumage]
δ = [1., 1.]
wvars = [:logdensity, :logarea]
simdf = df[sample(1:nrow(df), n_obs_sim),:]
simdf[:nsim] = brentrysim(simdf, svar, xvars, wvars, α, β, γ, δ)
println("Distribution of number of firms")
```
```
Distribution of number of firms
```
```julia
for i in 0:length(α)
  println("$(mean(simdf[:nsim].==i))")
end
```
```
0.434
0.272
0.294
```
To estimate from the simulated data, you could do the following.
```julia
using Optim, ForwardDiff, LinearAlgebra, PrettyTables
try
  using EntrySolution # this contains my code for the likelihood and
                      # it's intentionally not included in the assignment
catch
end
(loglike, unpack, pack) = brentrymodel(simdf, :nsim, svar, xvars, wvars)
θ0 = pack(α, β, γ, δ)
loglike(θ0)

# initial values --- note that you may run into optimization problems
# with poor initial values. This is especially likely if
# s*cumsum(α)[c] - cumsum(γ)[c] is not decreasing with c. You can
# ensure this by making α < 0 and γ > 0
βi = zeros(size(β))
δi = zeros(size(δ))
αi = zeros(size(α))
γi = ones(size(γ))
θi = pack(αi, βi, γi, δi);
loglike(θi)
res = optimize((x)->(-loglike(x)), θi, method=BFGS(),
               autodiff=:forward, show_trace=true)
```
```
Iter     Function value   Gradient norm
     0     1.730088e+00     1.620804e+01
     1     1.209517e+00     1.288377e+00
     2     7.222989e-01     1.781650e-01
     3     6.718382e-01     1.838065e-01
     4     6.654975e-01     3.980007e-02
     5     6.515625e-01     1.764305e-02
     6     6.508047e-01     1.661900e-02
     7     6.504096e-01     1.809850e-02
     8     6.486085e-01     1.025959e-02
     9     6.479257e-01     2.199461e-02
    10     6.468372e-01     2.803841e-03
    11     6.468231e-01     1.912744e-03
    12     6.465327e-01     5.307883e-03
    13     6.462838e-01     2.957066e-03
    14     6.462581e-01     4.557924e-03
    15     6.457693e-01     2.721388e-03
    16     6.457607e-01     4.537976e-04
    17     6.457601e-01     4.817264e-04
    18     6.454308e-01     1.765586e-02
    19     6.451313e-01     4.076670e-04
    20     6.451306e-01     2.094665e-05
    21     6.451306e-01     2.697787e-07
    22     6.451306e-01     6.725598e-10
```
```julia
# if you have problems, maybe look at one parameter at a time, e.g.
# res = optimize((x)->(-loglike(pack(x, β, γ, δ))), αi, method=BFGS(),
#                autodiff=:forward, show_trace=true)
θhat = res.minimizer
(αhat, βhat, γhat, δhat) = unpack(θhat)

# calculate standard errors
H = ForwardDiff.hessian(loglike, θhat)
Varθ = -inv(H)./nrow(simdf);
(seα, seβ, seγ, seδ) = unpack(sqrt.(diag(Varθ)))

# Print a nice(ish) table
header = ["Parameter", "Truth", "Estimate", "(SE)"];
param = [["α[$i]" for i in eachindex(α)];
         ["β[$i]" for i in eachindex(β)];
         ["γ[$i]" for i in eachindex(γ)];
         ["δ[$i]" for i in eachindex(δ)]];
# highlight estimates that reject H0 : estimate = truth at the 99% level
h1 = Highlighter(f = (tbl, i, j)->((j==3 || j==4) &&
                       abs((tbl[i,2]-tbl[i,3])/tbl[i,4]) > quantile(Normal(), 0.995)),
                 crayon = crayon"red bold");
```
```julia
tbl = pretty_table(hcat(param, θ0, θhat, sqrt.(diag(Varθ))), header,
                   formatter = Dict(3 => (v,i) -> round(v,digits=3),
                                    4 => (v,i) -> "($(round(v,digits=3)))"),
                   highlighters = tuple(h1))
```

```
┌───────────┬───────┬──────────┬─────────┐
│ Parameter │ Truth │ Estimate │    (SE) │
├───────────┼───────┼──────────┼─────────┤
│      α[1] │   1.0 │    2.814 │ (1.842) │
│      α[2] │  -1.0 │    -0.56 │ (0.487) │
│      β[1] │   1.0 │    0.836 │ (0.267) │
│      β[2] │   1.0 │   -2.325 │ (3.307) │
│      γ[1] │   1.0 │    0.955 │ (0.135) │
│      γ[2] │   1.0 │    1.179 │ (0.195) │
│      δ[1] │   1.0 │    0.461 │ (0.334) │
│      δ[2] │   1.0 │     0.65 │ (0.302) │
└───────────┴───────┴──────────┴─────────┘
```
Ideally, you would do this many times and verify that, as the sample size increases, the estimates get close to the true parameters. Note that maximum likelihood is generally only consistent, not unbiased. Also, I have semi-deliberately set up the simulation so that there is a pair of parameters that are not well identified separately. Can you figure out what the problem is?
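One way to investigate (without giving the answer away) is to inspect the Hessian of the loglikelihood at the estimates: an eigenvalue near zero means the data barely pin down some linear combination of parameters, and the associated eigenvector reveals which parameters are involved. A sketch, reusing `H` from the standard error calculation above:

```julia
using LinearAlgebra

# Eigenvalues of -H near zero flag weakly identified directions in the
# parameter space; large entries of the corresponding eigenvector show
# which parameters are entangled.
ev = eigen(Symmetric(-H))
println("smallest eigenvalue:  ", minimum(ev.values))
println("associated direction: ", ev.vectors[:, argmin(ev.values)])
```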
Estimate the model on the real data. Briefly discuss your choice of “X” and “W” variables. Be sure to check the output of optimize().
You may have to do some tweaking of initial values and/or optimization algorithm to get convergence. As in the simulation, report both your parameter estimates and standard errors.
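When checking the output of optimize(), a few standard diagnostics are worth running before trusting the estimates. A sketch using Optim.jl's accessor functions, with `res` and `loglike` as above:

```julia
# Basic diagnostics on the optimizer output.
println("converged:  ", Optim.converged(res))
println("iterations: ", Optim.iterations(res))
println("minimum:    ", Optim.minimum(res))
# The negative Hessian should be positive definite at a proper interior
# maximum; if not, the standard errors are unreliable.
isposdef(Symmetric(-ForwardDiff.hessian(loglike, res.minimizer))) ||
  @warn "Hessian is not negative definite at the optimum"
```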
Create tables and/or figures that show how well your estimates and model fit the data.
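One simple fit check is to compare the observed distribution of the number of pharmacies with the distribution implied by the estimated parameters, which you can obtain by simulating from the fitted model with `brentrysim`. A hedged sketch, where `:npharm` is a placeholder for whatever your firm-count column is called, and `df`, `θhat`, `unpack`, and the variable lists are as above:

```julia
# Compare observed vs. model-implied distribution of firm counts by
# simulating from the estimated parameters.
(αhat, βhat, γhat, δhat) = unpack(θhat)
df[:nfit] = brentrysim(df, svar, xvars, wvars, αhat, βhat, γhat, δhat)
for i in 0:length(αhat) # the model caps N at length(αhat)
  println("N = $i   data: $(round(mean(skipmissing(df[:npharm] .== i)), digits=3))" *
          "   model: $(round(mean(df[:nfit] .== i), digits=3))")
end
```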
Compute entry thresholds and create a figure similar to Figure 4 from Bresnahan and Reiss (1991). Since this data generally has more pharmacies than in Bresnahan and Reiss (1991), you should probably choose something larger than 5 for the maximum N to plot. Use the delta method to compute standard errors for your \(s_N\) and add confidence bands to the figure.
```julia
using ForwardDiff

# delta method demo
estimate = [1.0, 1.0]
variance_estimate = [1.0 0.0; 0.0 2.0]
function func_of_estimate(θ)
  # you'd replace this with your function to calculate the size thresholds
  [sum(θ ./ (θ.*θ)), θ[1]*θ[2], θ[1] - θ[2]]
end
sn = func_of_estimate(estimate)
∇sn = ForwardDiff.jacobian(func_of_estimate, estimate)
variance_sn = ∇sn*variance_estimate*∇sn'
```
```
3×3 Array{Float64,2}:
  3.0  -3.0   1.0
 -3.0   3.0  -1.0
  1.0  -1.0   3.0
```
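Continuing the demo, pointwise confidence bands are the estimates plus or minus a normal critical value times the delta-method standard errors (the square roots of the diagonal of `variance_sn`). This assumes `LinearAlgebra` and `Distributions` are loaded, as earlier:

```julia
using LinearAlgebra, Distributions

# 95% pointwise confidence bands from the delta-method variance
se_sn = sqrt.(diag(variance_sn))
ci_lower = sn .- quantile(Normal(), 0.975).*se_sn
ci_upper = sn .+ quantile(Normal(), 0.975).*se_sn
```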
Bresnahan, Timothy F., and Peter C. Reiss. 1991. “Entry and Competition in Concentrated Markets.” Journal of Political Economy 99 (5): 977–1009. http://www.jstor.org/stable/2937655.