Generalized fiducial inference for logistic regression
The main function of the 'GFIlogisticRegression' package is fidSampleLR. It simulates the fiducial distribution of the parameters of a logistic regression model.
Example
To illustrate it, we will consider a logistic dose-response model for inference on the median lethal dose. The median lethal dose (LD50) is the dose of a substance, such as a drug, that is expected to kill half of the subjects exposed to it.
The results of LD50 experiments can be modeled using the relation
\[\textrm{logit}(p_i) = \beta_1(x_i - \mu)\]
where $p_i$ is the probability of death at the administered dose $x_i$, and $\mu$ is the median lethal dose, i.e. the dose at which the probability of death is $0.5$. The $x_i$ are known, while $\beta_1$ and $\mu$ are unknown fixed effects.
This relation can be written in the form
\[\textrm{logit}(p_i) = \beta_0 + \beta_1 x_i\]
with $\mu = -\beta_0 / \beta_1$.
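To see where this identity comes from, expand the first form and match coefficients with the second. This assumes $\beta_1 \neq 0$, since otherwise $\mu$ is not defined:
\[\beta_1(x_i - \mu) = -\beta_1\mu + \beta_1 x_i \quad\Longrightarrow\quad \beta_0 = -\beta_1\mu \quad\Longrightarrow\quad \mu = -\frac{\beta_0}{\beta_1}\]
As a check, setting $x_i = \mu$ gives $\textrm{logit}(p_i) = 0$, that is, $p_i = 0.5$.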
We will perform the fiducial inference in this model with the following data:
using DataFrames
data = DataFrame(
    x = [
        -2, -2, -2, -2, -2,
        -1, -1, -1, -1, -1,
        0, 0, 0, 0, 0,
        1, 1, 1, 1, 1,
        2, 2, 2, 2, 2
    ],
    y = [
        1, 0, 0, 0, 0,
        1, 1, 1, 0, 0,
        1, 1, 0, 0, 0,
        1, 1, 1, 1, 0,
        1, 1, 1, 1, 1
    ]
)

| | x | y |
|---|---|---|
| | Int64 | Int64 |
| 1 | -2 | 1 |
| 2 | -2 | 0 |
| 3 | -2 | 0 |
| 4 | -2 | 0 |
| 5 | -2 | 0 |
| 6 | -1 | 1 |
| 7 | -1 | 1 |
| 8 | -1 | 1 |
| 9 | -1 | 0 |
| 10 | -1 | 0 |
| 11 | 0 | 1 |
| 12 | 0 | 1 |
| 13 | 0 | 0 |
| 14 | 0 | 0 |
| 15 | 0 | 0 |
| 16 | 1 | 1 |
| 17 | 1 | 1 |
| 18 | 1 | 1 |
| 19 | 1 | 1 |
| 20 | 1 | 0 |
| 21 | 2 | 1 |
| 22 | 2 | 1 |
| 23 | 2 | 1 |
| 24 | 2 | 1 |
| 25 | 2 | 1 |
Let's go with $20000$ fiducial simulations:
using StatsModels, GFIlogisticRegression
fidsamples = fidSampleLR(@formula(y ~ x), data, 20000)

(Beta = 20000×2 DataFrame
Row │ (Intercept) x
│ Float64 Float64
───────┼───────────────────────
1 │ 1.03364 3.04606
2 │ 1.25911 1.87752
3 │ 1.77293 1.56713
4 │ 0.841713 1.70577
5 │ 0.799326 0.794347
6 │ 0.380466 1.75835
7 │ 0.180954 1.27897
8 │ 1.66278 0.764177
⋮ │ ⋮ ⋮
19994 │ -0.095358 0.647854
19995 │ -0.599387 0.274277
19996 │ 1.06014 0.666367
19997 │ 0.267742 0.657362
19998 │ 0.603148 0.650022
19999 │ 1.88123 0.966845
20000 │ 1.31249 1.34249
19985 rows omitted, Weights = [3.62769350871564e-6, 2.9731452355707228e-5, 3.2719150286775974e-5, 3.4544450402466984e-5, 3.948621495561722e-5, 8.111139070357383e-5, 5.022869913859303e-5, 4.6475037536676876e-5, 8.413937445591667e-5, 2.171685957238218e-5 … 6.839878587355508e-5, 2.8355736963279443e-5, 5.70496814853827e-5, 7.342000115900292e-5, 1.628168889657282e-5, 1.7314750095805775e-5, 1.833238871160176e-5, 2.249293442223631e-5, 2.2035005337609222e-5, 2.2459093573934344e-5])

Here are the fiducial estimates and $95\%$ confidence intervals of the parameters $\beta_0$ and $\beta_1$:
fidSummary(fidsamples)

| | variable | mean | median | lwr | upr |
|---|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 | Float64 |
| 1 | (Intercept) | 0.566678 | 0.542482 | -0.454585 | 1.7196 |
| 2 | x | 0.946116 | 0.907732 | 0.163599 | 1.95445 |
Now let us compute the fiducial $95\%$ confidence interval for our parameter of interest $\mu$:
fidConfInt("-:\"(Intercept)\" ./ :x", fidsamples, 0.95)

(lower = -3.230485923227428, upper = 0.73469846184404)
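The output of fidSampleLR shown earlier is a named tuple with a Beta data frame of draws and a Weights vector. To make the idea of an interval built from weighted draws concrete, here is a minimal sketch that computes weighted empirical quantiles of $\mu = -\beta_0 / \beta_1$. It is an illustration only: `weighted_quantile` is a hypothetical helper that is not part of the package, the numbers are toy values rather than real fiducial draws, and nothing here is claimed to be how fidConfInt works internally.

```julia
# Hypothetical helper (not part of GFIlogisticRegression): weighted empirical
# quantile of `values`, where `weights` are non-negative and not all zero.
function weighted_quantile(values::AbstractVector{<:Real},
                           weights::AbstractVector{<:Real}, p::Real)
    length(values) == length(weights) || throw(ArgumentError("length mismatch"))
    0 <= p <= 1 || throw(ArgumentError("p must lie in [0, 1]"))
    order = sortperm(values)                       # indices that sort the draws
    cumw = cumsum(weights[order]) ./ sum(weights)  # normalized cumulative weights
    idx = findfirst(c -> c >= p, cumw)             # first draw whose cumulative weight reaches p
    idx === nothing && (idx = length(values))      # guard against rounding in the cumulative sum
    return values[order[idx]]
end

# Toy stand-ins for the "(Intercept)" and "x" columns of fidsamples.Beta
# and for fidsamples.Weights (assumed values, for illustration only).
beta0 = [0.5, 1.0, -0.2, 0.8]
beta1 = [1.0, 2.0, 0.5, 1.6]
w = [0.1, 0.4, 0.3, 0.2]

# Same transformation as the expression "-:\"(Intercept)\" ./ :x"
mu = -beta0 ./ beta1

# Equal-tailed 95% interval from the weighted draws
lower = weighted_quantile(mu, w, 0.025)
upper = weighted_quantile(mu, w, 0.975)
println((lower = lower, upper = upper))
```

With real output one would replace the toy vectors by the corresponding columns of the Beta data frame and by the Weights vector. Draws with $\beta_1$ close to zero produce very large values of $\mu$, which is one reason such an interval can be wide.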
Member functions
GFIlogisticRegression.fidConfInt — Function
fidConfInt(parameter, fidsamples, conf)
Fiducial confidence interval of a parameter of interest.
Arguments
parameter: an expression of the parameter of interest given as a string; see the example
fidsamples: an output of fidSampleLR
conf: confidence level
Example
using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
y = [0, 0, 1, 1, 1, 1],
group = ["A", "A", "A", "B", "B", "B"]
)
fidsamples = fidSampleLR(@formula(y ~ 0 + group), data, 3000)
fidConfInt(":\"group: A\" - :\"group: B\"", fidsamples, 0.95)

GFIlogisticRegression.fidProb — Method
fidProb(parameter, fidsamples, q)
Fiducial non-exceedance probability of a parameter of interest.
Arguments
parameter: an expression of the parameter of interest given as a string; see the example
fidsamples: fiducial simulations, an output of fidSampleLR
q: the non-exceedance threshold
Example
using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
y = [0, 0, 1, 1, 1],
x = [-2, -1, 0, 1, 2]
)
fidsamples = fidSampleLR(@formula(y ~ x), data, 3000)
fidProb("map(exp, :x)", fidsamples, 1) # this is Pr(exp(x) <= 1)

GFIlogisticRegression.fidQuantile — Method
fidQuantile(parameter, fidsamples, p)
Fiducial quantile of a parameter of interest.
Arguments
parameter: an expression of the parameter of interest given as a string; see the example
fidsamples: an output of fidSampleLR
p: quantile level, between 0 and 1
Example
using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
y = [0, 0, 1, 1, 1, 1],
group = ["A", "A", "A", "B", "B", "B"]
)
fidsamples = fidSampleLR(@formula(y ~ 0 + group), data, 3000)
fidQuantile(":\"group: A\" ./ :\"group: B\"", fidsamples, 0.5)

GFIlogisticRegression.fidSampleLR — Function
fidSampleLR(formula, data, N[, gmp][, thresh])
Fiducial sampling of the parameters of the logistic regression model.
Arguments
formula: a formula describing the model
data: data frame in which the variables of the model can be found
N: number of simulations
gmp: whether to use exact arithmetic in the algorithm
thresh: the threshold used in the sequential sampler; the default N/2 should not be changed
Example
using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
y = [0, 0, 1, 1, 1],
x = [-2, -1, 0, 1, 2]
)
fidsamples = fidSampleLR(@formula(y ~ x), data, 3000)

GFIlogisticRegression.fidSummary — Method
fidSummary(fidsamples)
Summary of the fiducial simulations.
Argument
fidsamples: an output of fidSampleLR
Example
using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
y = [0, 0, 1, 1, 1],
x = [-2, -1, 0, 1, 2]
)
fidsamples = fidSampleLR(@formula(y ~ x), data, 3000)
fidSummary(fidsamples)