Title: | Bayesian Methods to Estimate the Proportion of Liars in Coin Flip Experiments |
---|---|
Description: | Implements Bayesian methods, described in Hugh-Jones (2019) <doi:10.1007/s40881-019-00069-x>, for estimating the proportion of liars in coin flip-style experiments, where subjects report a random outcome and are paid for reporting a "good" outcome. |
Authors: | David Hugh-Jones <[email protected]> |
Maintainer: | David Hugh-Jones <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.0.9000 |
Built: | 2024-11-01 03:54:10 UTC |
Source: | https://github.com/hughjonesd/truelies |
Given two distributions with density functions ,
this calculates:
the probability that the value of the first distribution is greater.
compare_dists(dist1, dist2)
compare_dists(dist1, dist2)
dist1 |
Density of distribution 1, as a one-argument function. |
dist2 |
Density of distribution 2. |
A probability scalar.
d1 <- update_prior(30, 50, P = 0.5, prior = stats::dunif) d2 <- update_prior(25, 40, P = 0.5, prior = stats::dunif) compare_dists(d1, d2)
d1 <- update_prior(30, 50, P = 0.5, prior = stats::dunif) d2 <- update_prior(25, 40, P = 0.5, prior = stats::dunif) compare_dists(d1, d2)
Given two probability density functions dist1
and dist2
, difference_dist
returns the density of “dist1 - dist2'.
difference_dist(dist1, dist2)
difference_dist(dist1, dist2)
dist1 , dist2
|
Probability density functions |
At the moment this only works when dist1 and dist2 are defined on [0, 1]
.
A probability density function defined on [-1, 1]
.
d1 <- update_prior(30, 50, P = 0.5, prior = stats::dunif) d2 <- update_prior(32, 40, P = 0.5, prior = stats::dunif) dd <- difference_dist(d1, d2) dist_hdr(dd, 0.95)
d1 <- update_prior(30, 50, P = 0.5, prior = stats::dunif) d2 <- update_prior(32, 40, P = 0.5, prior = stats::dunif) dd <- difference_dist(d1, d2) dist_hdr(dd, 0.95)
This is a wrapper for hdrcde::hdr
. The highest density region is the
interval that covers conf_level
of the data and has the highest
average density. See:
dist_hdr(dist, conf_level, bounds = attr(dist, "limits"))
dist_hdr(dist, conf_level, bounds = attr(dist, "limits"))
dist |
A one-argument function |
conf_level |
A scalar between 0 and 1 |
bounds |
A length 2 vector of the bounds of the distribution's support |
Rob J Hyndman (1996) “Computing and graphing highest density regions”. American Statistician, 50, 120-126.
A length 2 vector of region endpoints
d1 <- update_prior(33, 50, P = 0.5, prior = stats::dunif) dist_hdr(d1, 0.95)
d1 <- update_prior(33, 50, P = 0.5, prior = stats::dunif) dist_hdr(d1, 0.95)
Find mean of a probability density function
dist_mean(dist, l = attr(dist, "limits")[1], r = attr(dist, "limits")[2])
dist_mean(dist, l = attr(dist, "limits")[1], r = attr(dist, "limits")[2])
dist |
A one-argument function returned from |
l |
Lower bound of the density's support |
r |
Upper bound of the density's support |
A scalar
d1 <- update_prior(10, 40, P = 5/6, prior = stats::dunif) dist_mean(d1)
d1 <- update_prior(10, 40, P = 5/6, prior = stats::dunif) dist_mean(d1)
Find quantiles given a probability density function
dist_quantile(dist, probs, bounds = attr(dist, "limits"))
dist_quantile(dist, probs, bounds = attr(dist, "limits"))
dist |
A one argument function |
probs |
A vector of probabilities |
bounds |
A length 2 vector of the bounds of the distribution's support |
A vector of quantiles
d1 <- update_prior(33, 50, P = 0.5, prior = stats::dunif) dist_quantile(d1, c(0.025, 0.975))
d1 <- update_prior(33, 50, P = 0.5, prior = stats::dunif) dist_quantile(d1, c(0.025, 0.975))
This function creates a prior by fitting a Beta distribution to the heads/N
vector,
using MASS::fitdistr()
. The prior is then updated using data from each
individual sample to give the posterior distributions.
empirical_bayes(heads, ...) ## Default S3 method: empirical_bayes(heads, N, P, ...) ## S3 method for class 'formula' empirical_bayes(formula, data, P, subset, ...)
empirical_bayes(heads, ...) ## Default S3 method: empirical_bayes(heads, N, P, ...) ## S3 method for class 'formula' empirical_bayes(formula, data, P, subset, ...)
heads |
A vector of numbers of the good outcome reported |
... |
Ignored |
N |
A vector of sample sizes |
P |
Probability of bad outcome |
formula |
A two-sided formula of the form |
data |
A data frame or matrix. Each row represents one individual. |
subset |
A logical or numeric vector specifying the subset of data to use |
The formula interface allows calling the function directly on experimental data.
A list with two components:
prior
, the calculated empirical prior (of class densityFunction
).
posterior
, a list of posterior distributions (objects of class densityFunction
).
If heads
was named, the list will have the same names.
heads <- c(Baseline = 30, Treatment1 = 38, Treatment2 = 45) N <- c(50, 52, 57) res <- empirical_bayes(heads, N, P = 0.5) compare_dists(res$posteriors$Baseline, res$posteriors$Treatment1) plot(res$prior, ylim = c(0, 4), col = "grey", lty = 2) plot(res$posteriors$Baseline, add = TRUE, col = "blue") plot(res$posteriors$Treatment1, add = TRUE, col = "orange") plot(res$posteriors$Treatment2, add = TRUE, col = "red") # starting from raw data: raw_data <- data.frame( report = sample(c("heads", "tails"), size = 300, replace = TRUE, prob = c(.8, .2) ), group = rep(LETTERS[1:10], each = 30) ) empirical_bayes(I(report == "heads") ~ group, data = raw_data, P = 0.5)
heads <- c(Baseline = 30, Treatment1 = 38, Treatment2 = 45) N <- c(50, 52, 57) res <- empirical_bayes(heads, N, P = 0.5) compare_dists(res$posteriors$Baseline, res$posteriors$Treatment1) plot(res$prior, ylim = c(0, 4), col = "grey", lty = 2) plot(res$posteriors$Baseline, add = TRUE, col = "blue") plot(res$posteriors$Treatment1, add = TRUE, col = "orange") plot(res$posteriors$Treatment2, add = TRUE, col = "red") # starting from raw data: raw_data <- data.frame( report = sample(c("heads", "tails"), size = 300, replace = TRUE, prob = c(.8, .2) ), group = rep(LETTERS[1:10], each = 30) ) empirical_bayes(I(report == "heads") ~ group, data = raw_data, P = 0.5)
This uses simulations to estimate the power to detect a given level of lying in a
sample of size N
by this package's methods.
power_calc(N, P, lambda, alpha = 0.05, prior = stats::dunif, nsims = 200)
power_calc(N, P, lambda, alpha = 0.05, prior = stats::dunif, nsims = 200)
N |
Total number in sample |
P |
Probability of bad outcome |
lambda |
Probability of a subject lying |
alpha |
Significance level to use for the null hypothesis |
prior |
Prior over lambda. A function which takes a vector of values between 0 and 1, and returns the probability density. The default is the uniform distribution. |
nsims |
Number of simulations to run |
Estimated power, a scalar between 0 and 1.
power_calc(N = 50, P = 0.5, lambda = 0.2)
power_calc(N = 50, P = 0.5, lambda = 0.2)
Using simulations, estimate power to detect differences in lying
using compare_dists()
, given values for , the
probability of lying, in each sample.
power_calc_difference(N1, N2 = N1, P, lambda1, lambda2, alpha = 0.05, alternative = c("two.sided", "greater", "less"), prior = stats::dunif, nsims = 200)
power_calc_difference(N1, N2 = N1, P, lambda1, lambda2, alpha = 0.05, alternative = c("two.sided", "greater", "less"), prior = stats::dunif, nsims = 200)
N1 |
N of sample 1 |
N2 |
N of sample 2 |
P |
Probability of bad outcome |
lambda1 |
Probability of lying in sample 1 |
lambda2 |
Probability of lying in sample 2 |
alpha |
Significance level |
alternative |
"two.sided", "greater" (sample 1 is greater), or "less". Can be abbreviated |
prior |
Prior over lambda. A function which takes a vector of values between 0 and 1, and returns the probability density. The default is the uniform distribution. |
nsims |
Number of simulations to run |
Estimated power, a scalar between 0 and 1.
power_calc_difference(N1 = 100, P = 0.5, lambda = 0, lambda2 = 0.25)
power_calc_difference(N1 = 100, P = 0.5, lambda = 0, lambda2 = 0.25)
densityFunction
.Print/plot an object of class densityFunction
.
## S3 method for class 'densityFunction' print(x, ...) ## S3 method for class 'densityFunction' plot(x, ...)
## S3 method for class 'densityFunction' print(x, ...) ## S3 method for class 'densityFunction' plot(x, ...)
x |
The object |
... |
Unused |
d1 <- update_prior(33, 50, P = 0.5, prior = stats::dunif) d1 plot(d1) # show the actual R code (techies only) unclass(d1)
d1 <- update_prior(33, 50, P = 0.5, prior = stats::dunif) d1 plot(d1) # show the actual R code (techies only) unclass(d1)
update_prior
uses the equation for the posterior:
where is the prior and
is the
probability of R reports of heads given that people lie with probability
:
update_prior(heads, N, P, prior = stats::dunif, npoints = 1000)
update_prior(heads, N, P, prior = stats::dunif, npoints = 1000)
heads |
Number of good outcomes reported |
N |
Total number in sample |
P |
Probability of bad outcome |
prior |
Prior over lambda. A function which takes a vector of values between 0 and 1, and returns the probability density. The default is the uniform distribution. |
npoints |
How many points to integrate on? |
The probability density of the posterior distribution, as a one-argument function.
posterior <- update_prior(heads = 30, N = 50, P = 0.5, prior = stats::dunif) plot(posterior)
posterior <- update_prior(heads = 30, N = 50, P = 0.5, prior = stats::dunif) plot(posterior)