Generalized fiducial inference for logistic regression

The main function of the 'GFIlogisticRegression' package is fidSampleLR. It simulates the fiducial distribution of the parameters of a logistic regression model.

Example

To illustrate its use, we consider a logistic dose-response model for inference on the median lethal dose. The median lethal dose (LD50) is the dose of a substance, such as a drug, that is expected to kill half of the subjects exposed to it.

The results of LD50 experiments can be modeled using the relation

\[\textrm{logit}(p_i) = \beta_1(x_i - \mu)\]

where $p_i$ is the probability of death at the administered dose $x_i$, and $\mu$ is the median lethal dose, i.e. the dose at which the probability of death is $0.5$. The $x_i$ are known, while $\beta_1$ and $\mu$ are unknown fixed effects.

This relation can be written in the form

\[\textrm{logit}(p_i) = \beta_0 + \beta_1 x_i\]

with $\mu = -\beta_0 / \beta_1$.
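
To see where this identity comes from, one can expand the first form and match coefficients with the second (assuming $\beta_1 \neq 0$):

\[\beta_1(x_i - \mu) = -\beta_1\mu + \beta_1 x_i \quad\Longrightarrow\quad \beta_0 = -\beta_1\mu \quad\Longleftrightarrow\quad \mu = -\frac{\beta_0}{\beta_1}\]

Setting $x_i = \mu$ in either form gives $\textrm{logit}(p_i) = 0$, that is $p_i = 0.5$, which matches the definition of the median lethal dose.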

We will perform the fiducial inference in this model with the following data:

using DataFrames
data = DataFrame(
  x = [
    -2, -2, -2, -2, -2,
    -1, -1, -1, -1, -1,
     0,  0,  0,  0,  0,
     1,  1,  1,  1,  1,
     2,  2,  2,  2,  2
  ],
  y = [
    1, 0, 0, 0, 0,
    1, 1, 1, 0, 0,
    1, 1, 0, 0, 0,
    1, 1, 1, 1, 0,
    1, 1, 1, 1, 1
  ]
)

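25 rows × 2 columns

 Row │ x      y
     │ Int64  Int64
─────┼──────────────
   1 │    -2      1
   2 │    -2      0
   3 │    -2      0
   4 │    -2      0
   5 │    -2      0
   6 │    -1      1
   7 │    -1      1
   8 │    -1      1
   9 │    -1      0
  10 │    -1      0
  11 │     0      1
  12 │     0      1
  13 │     0      0
  14 │     0      0
  15 │     0      0
  16 │     1      1
  17 │     1      1
  18 │     1      1
  19 │     1      1
  20 │     1      0
  21 │     2      1
  22 │     2      1
  23 │     2      1
  24 │     2      1
  25 │     2      1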

Let's go with $20000$ fiducial simulations:

using StatsModels, GFIlogisticRegression
fidsamples = fidSampleLR(@formula(y ~ x), data, 20000)
(Beta = 20000×2 DataFrame
   Row │ (Intercept)  x
       │ Float64      Float64
───────┼───────────────────────
     1 │    1.03364   3.04606
     2 │    1.25911   1.87752
     3 │    1.77293   1.56713
     4 │    0.841713  1.70577
     5 │    0.799326  0.794347
     6 │    0.380466  1.75835
     7 │    0.180954  1.27897
     8 │    1.66278   0.764177
   ⋮   │      ⋮          ⋮
 19994 │   -0.095358  0.647854
 19995 │   -0.599387  0.274277
 19996 │    1.06014   0.666367
 19997 │    0.267742  0.657362
 19998 │    0.603148  0.650022
 19999 │    1.88123   0.966845
 20000 │    1.31249   1.34249
             19985 rows omitted, Weights = [3.62769350871564e-6, 2.9731452355707228e-5, 3.2719150286775974e-5, 3.4544450402466984e-5, 3.948621495561722e-5, 8.111139070357383e-5, 5.022869913859303e-5, 4.6475037536676876e-5, 8.413937445591667e-5, 2.171685957238218e-5  …  6.839878587355508e-5, 2.8355736963279443e-5, 5.70496814853827e-5, 7.342000115900292e-5, 1.628168889657282e-5, 1.7314750095805775e-5, 1.833238871160176e-5, 2.249293442223631e-5, 2.2035005337609222e-5, 2.2459093573934344e-5])

Here are the fiducial estimates and $95\%$-confidence intervals of the parameters $\beta_0$ and $\beta_1$:

fidSummary(fidsamples)

2 rows × 5 columns

 Row │ variable     mean      median    lwr        upr
     │ String       Float64   Float64   Float64    Float64
─────┼─────────────────────────────────────────────────────
   1 │ (Intercept)  0.566678  0.542482  -0.454585  1.7196
   2 │ x            0.946116  0.907732   0.163599  1.95445

Now let us compute the fiducial $95\%$-confidence interval for our parameter of interest $\mu = -\beta_0 / \beta_1$:

fidConfInt("-:\"(Intercept)\" ./ :x", fidsamples, 0.95)
(lower = -3.230485923227428, upper = 0.73469846184404)
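
The output of fidSampleLR shown earlier pairs each simulated row of Beta with an entry of Weights, so the fiducial sample is a weighted one. The following is a minimal sketch of how an equal-tailed interval for a derived parameter could be read off such a sample, assuming it is taken from weighted quantiles. The helper `weighted_quantile` and the toy numbers are hypothetical and only illustrate the idea; the package's actual implementation may differ.

```julia
# Hypothetical helper, NOT part of GFIlogisticRegression: returns the smallest
# sample value whose cumulative normalized weight reaches the level p.
function weighted_quantile(values::AbstractVector{<:Real},
                           weights::AbstractVector{<:Real}, p::Real)
    0 <= p <= 1 || throw(ArgumentError("p must lie in [0, 1]"))
    order = sortperm(values)                       # sort values, carry weights along
    cumw = cumsum(weights[order]) ./ sum(weights)  # weighted empirical CDF
    idx = findfirst(>=(p), cumw)
    idx === nothing && (idx = length(order))       # guard against rounding at p = 1
    return values[order[idx]]
end

# Toy weighted sample (made-up numbers, for illustration only).
beta0 = [0.5, 1.0, -0.2, 0.9, 0.3]     # intercept draws
beta1 = [1.0, 2.5, 0.5, 1.5, 1.2]      # slope draws
w     = [0.1, 0.3, 0.2, 0.25, 0.15]    # weights of the draws

mu = -beta0 ./ beta1   # same transformation as "-:\"(Intercept)\" ./ :x"
conf = 0.95
lower = weighted_quantile(mu, w, (1 - conf) / 2)
upper = weighted_quantile(mu, w, (1 + conf) / 2)
println((lower = lower, upper = upper))
```

The transformation is applied draw by draw, and the weights are carried over unchanged to the transformed values.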

Member functions

GFIlogisticRegression.fidConfInt - Function
fidConfInt(parameter, fidsamples, conf)

Fiducial confidence interval of a parameter of interest.

Arguments

  • parameter: an expression of the parameter of interest given as a string; see the example
  • fidsamples: an output of fidSampleLR
  • conf: confidence level

Example

using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
  y = [0, 0, 1, 1, 1, 1],
  group = ["A", "A", "A", "B", "B", "B"]
)
fidsamples = fidSampleLR(@formula(y ~ 0 + group), data, 3000)
fidConfInt(":\"group: A\" - :\"group: B\"", fidsamples, 0.95)

GFIlogisticRegression.fidProb - Method
fidProb(parameter, fidsamples, q)

Fiducial non-exceedance probability of a parameter of interest.

Arguments

  • parameter: an expression of the parameter of interest given as a string; see the example
  • fidsamples: fiducial simulations, an output of fidSampleLR
  • q: the non-exceedance threshold

Example

using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
  y = [0, 0, 1, 1, 1],
  x = [-2, -1, 0, 1, 2]
)
fidsamples = fidSampleLR(@formula(y ~ x), data, 3000)
fidProb("map(exp, :x)", fidsamples, 1) # this is Pr(exp(β₁) <= 1), where β₁ is the coefficient of x
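
As a rough illustration of what a non-exceedance probability means for a weighted sample, the sketch below sums the normalized weights of the draws whose transformed value does not exceed the threshold. The helper `weighted_prob` and the toy numbers are hypothetical, and this is an assumption about the general idea rather than a description of the package's code.

```julia
# Hypothetical helper, NOT part of GFIlogisticRegression: weighted proportion
# of the sample values that are less than or equal to the threshold q.
weighted_prob(values, weights, q) = sum(weights[values .<= q]) / sum(weights)

# Toy weighted draws of a slope coefficient (made-up numbers).
slope = [-0.3, 0.4, 1.2, 0.9, -0.1]
w     = [0.1, 0.3, 0.2, 0.25, 0.15]

odds_ratio = map(exp, slope)            # same transformation as "map(exp, :x)"
prob = weighted_prob(odds_ratio, w, 1)  # weighted Pr(exp(slope) <= 1)
println(prob)
```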

GFIlogisticRegression.fidQuantile - Method
fidQuantile(parameter, fidsamples, p)

Fiducial quantile of a parameter of interest.

Arguments

  • parameter: an expression of the parameter of interest given as a string; see the example
  • fidsamples: an output of fidSampleLR
  • p: quantile level, between 0 and 1

Example

using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
  y = [0, 0, 1, 1, 1, 1],
  group = ["A", "A", "A", "B", "B", "B"]
)
fidsamples = fidSampleLR(@formula(y ~ 0 + group), data, 3000)
fidQuantile(":\"group: A\" ./ :\"group: B\"", fidsamples, 0.5)

GFIlogisticRegression.fidSampleLR - Function
fidSampleLR(formula, data, N[, gmp][, thresh])

Fiducial sampling of the parameters of the logistic regression model.

Arguments

  • formula: a formula describing the model
  • data: data frame in which the variables of the model can be found
  • N: number of simulations
  • gmp: whether to use exact arithmetic in the algorithm
  • thresh: the threshold used in the sequential sampler; the default N/2 should not be changed

Example

using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
  y = [0, 0, 1, 1, 1],
  x = [-2, -1, 0, 1, 2]
)
fidsamples = fidSampleLR(@formula(y ~ x), data, 3000)

GFIlogisticRegression.fidSummary - Method
fidSummary(fidsamples)

Summary of the fiducial simulations.

Argument

  • fidsamples: an output of fidSampleLR

Example

using GFIlogisticRegression, DataFrames, StatsModels
data = DataFrame(
  y = [0, 0, 1, 1, 1],
  x = [-2, -1, 0, 1, 2]
)
fidsamples = fidSampleLR(@formula(y ~ x), data, 3000)
fidSummary(fidsamples)