API References
Main Functions in the npDoseResponse Package
- npDoseResponse.npDoseResponse.DerivEffect(Y, X, t_eval=None, h_bar=None, kernT_bar='gaussian', h=None, b=None, C_h=7, C_b=3, print_bw=True, degree=2, deriv_ord=1, kernT='epanechnikov', kernS='epanechnikov', parallel=False, processes=20)[source]
Estimating the derivative of the dose-response curve via Nadaraya-Watson conditional CDF estimator.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0], which consists of the observed treatment variables.)
h_bar (float) – The bandwidth parameter for the Nadaraya-Watson conditional CDF estimator. (Default: h_bar=None. Then, Silverman’s rule of thumb is applied. See Chen et al. (2016) for details.)
kernT_bar (str) – The name of the kernel function for the Nadaraya-Watson conditional CDF estimator. (Default: “gaussian”.)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
parallel (boolean) – The indicator of whether the function should be executed in parallel via multiprocessing. (Default: parallel=False.)
processes (int) – The number of processes for parallel execution. (Default: processes=20.)
- Returns:
theta_C – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
- Return type:
(m,)-array
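The Nadaraya-Watson conditional CDF estimator named above can be illustrated with a minimal NumPy sketch. It assumes the CDF of interest is that of a single confounder S given the treatment T, and it uses one common form of Silverman's rule for h_bar; both are assumptions for illustration and may differ from what the package does internally. The helper name nw_conditional_cdf is hypothetical.

```python
import numpy as np

def nw_conditional_cdf(s_query, t_query, S, T, h_bar):
    """Nadaraya-Watson estimate of P(S <= s_query | T = t_query), Gaussian kernel."""
    w = np.exp(-0.5 * ((t_query - T) / h_bar) ** 2)   # kernel weights in the treatment
    return np.sum(w * (S <= s_query)) / np.sum(w)

rng = np.random.default_rng(0)
n = 500
T = rng.uniform(0.0, 1.0, n)                 # treatment
S = T + rng.normal(0.0, 0.1, n)              # one confounder that depends on T
h_bar = 1.06 * np.std(T) * n ** (-1 / 5)     # assumed rule-of-thumb form, not the package's
cdf_low = nw_conditional_cdf(-10.0, 0.5, S, T, h_bar)
cdf_mid = nw_conditional_cdf(0.5, 0.5, S, T, h_bar)
cdf_high = nw_conditional_cdf(10.0, 0.5, S, T, h_bar)
```

The estimate is a weighted fraction of observations with S below the query value, so it always lies between 0 and 1 and is non-decreasing in s_query.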
- npDoseResponse.npDoseResponse.DerivEffectBoot(Y, X, t_eval=None, boot_num=500, alpha=0.95, h_bar=None, kernT_bar='gaussian', h=None, b=None, C_h=7, C_b=3, print_bw=True, degree=2, deriv_ord=1, kernT='epanechnikov', kernS='epanechnikov', parallel=False, processes=20)[source]
Conduct inference on the derivative of the dose-response curve via Nadaraya-Watson conditional CDF estimator and nonparametric bootstrap.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0], which consists of the observed treatment variables.)
boot_num (int) – The number of bootstrap replications. (Default: boot_num=500.)
alpha (float) – The confidence level of both the uniform confidence band and pointwise confidence interval. (Default: alpha=0.95.)
h_bar (float) – The bandwidth parameter for the Nadaraya-Watson conditional CDF estimator. (Default: h_bar=None. Then, Silverman’s rule of thumb is applied. See Chen et al. (2016) for details.)
kernT_bar (str) – The name of the kernel function for the Nadaraya-Watson conditional CDF estimator. (Default: “gaussian”.)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
parallel (boolean) – The indicator of whether the function should be executed in parallel via multiprocessing. (Default: parallel=False.)
processes (int) – The number of processes for parallel execution. (Default: processes=20.)
- Returns:
theta_C ((m,)-array) – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
theta_C_boot ((m,)-array) – The estimated derivatives of the dose-response curve on bootstrap samples evaluated at points “t_eval”.
theta_alpha (float) – The width of the uniform confidence band.
theta_alpha_var ((m,)-array) – The widths of the pointwise confidence bands at evaluation points “t_eval”.
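The relationship between the bootstrap replicates and the two kinds of band widths returned above can be sketched as follows. This is one standard construction (quantiles of absolute deviations from the original estimate), given here as an assumption; the package may standardize or compute the widths differently. The function name bootstrap_band_widths is hypothetical.

```python
import numpy as np

def bootstrap_band_widths(est, est_boot, alpha=0.95):
    """Return (uniform_width, pointwise_widths) from bootstrap replicates.

    est: (m,) estimate on the original sample; est_boot: (B, m) bootstrap estimates.
    """
    abs_dev = np.abs(est_boot - est)                          # (B, m) deviations
    uniform_width = np.quantile(abs_dev.max(axis=1), alpha)   # quantile of the sup deviation
    pointwise_widths = np.quantile(abs_dev, alpha, axis=0)    # one quantile per evaluation point
    return uniform_width, pointwise_widths

rng = np.random.default_rng(1)
est = np.linspace(0.0, 1.0, 5)                          # stand-in for an (m,) estimate
est_boot = est + rng.normal(0.0, 0.1, size=(500, 5))    # stand-in for bootstrap output
u_w, p_w = bootstrap_band_widths(est, est_boot)
lower, upper = est - u_w, est + u_w                     # uniform confidence band
```

Under this construction the uniform width is never smaller than any pointwise width, because the supremum deviation dominates the deviation at each single point.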
- npDoseResponse.npDoseResponse.IntegEst(Y, X, t_eval=None, h_bar=None, kernT_bar='gaussian', h=None, b=None, C_h=7, C_b=3, print_bw=True, degree=2, deriv_ord=1, kernT='epanechnikov', kernS='epanechnikov', parallel=False, processes=20)[source]
Estimating the dose-response curve via our integral estimator with linear interpolation approximation.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0].)
h_bar (float) – The bandwidth parameter for the Nadaraya-Watson conditional CDF estimator. (Default: h_bar=None. Then, Silverman’s rule of thumb is applied. See Chen et al. (2016) for details.)
kernT_bar (str) – The name of the kernel function for the Nadaraya-Watson conditional CDF estimator. (Default: “gaussian”.)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
parallel (boolean) – The indicator of whether the function should be executed in parallel via multiprocessing. (Default: parallel=False.)
processes (int) – The number of processes for parallel execution. (Default: processes=20.)
- Returns:
m_est – The estimated dose-response curve evaluated at points “t_eval”.
- Return type:
(m,)-array
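As a rough illustration of what an integral estimator with linear interpolation can look like, the sketch below integrates a derivative estimate over the sorted observed treatments with the trapezoidal rule, interpolates the antiderivative linearly at t_eval, and averages Y_i plus the integral from T_i to t over the sample. The averaging formula is an assumption made for this example and is not taken from this reference; integrate_derivative is a hypothetical helper, and the derivative values are supplied exactly rather than estimated.

```python
import numpy as np

def integrate_derivative(T, Y, theta_at_T, t_eval):
    """Average over i of Y_i + integral of theta from T_i to t, for each t in t_eval."""
    order = np.argsort(T)
    T_sorted, theta_sorted = T[order], theta_at_T[order]
    # Cumulative trapezoidal integral of theta, starting at the smallest observed T.
    steps = 0.5 * (theta_sorted[1:] + theta_sorted[:-1]) * np.diff(T_sorted)
    Theta_sorted = np.concatenate(([0.0], np.cumsum(steps)))
    # Linear interpolation of the antiderivative at the evaluation points.
    Theta_eval = np.interp(t_eval, T_sorted, Theta_sorted)
    # mean_i [Y_i + Theta(t) - Theta(T_i)]
    return Y.mean() + Theta_eval - Theta_sorted.mean()

rng = np.random.default_rng(6)
T = rng.uniform(0.0, 2.0, 400)
Y = T ** 2                       # noise-free toy curve m(t) = t^2
theta = 2.0 * T                  # its exact derivative at the observed treatments
t_eval = np.linspace(0.2, 1.8, 9)
m_sketch = integrate_derivative(T, Y, theta, t_eval)
```

In this toy setting the output reproduces t_eval squared up to the linear-interpolation error between neighbouring observed treatments.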
- npDoseResponse.npDoseResponse.IntegEstBoot(Y, X, t_eval=None, boot_num=500, alpha=0.95, h_bar=None, kernT_bar='gaussian', h=None, b=None, C_h=7, C_b=3, print_bw=True, degree=2, deriv_ord=1, kernT='epanechnikov', kernS='epanechnikov', parallel=False, processes=20)[source]
Conduct inference on the dose-response curve via our integral estimator and nonparametric bootstrap.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0].)
boot_num (int) – The number of bootstrap replications. (Default: boot_num=500.)
alpha (float) – The confidence level of both the uniform confidence band and pointwise confidence interval. (Default: alpha=0.95.)
h_bar (float) – The bandwidth parameter for the Nadaraya-Watson conditional CDF estimator. (Default: h_bar=None. Then, Silverman’s rule of thumb is applied. See Chen et al. (2016) for details.)
kernT_bar (str) – The name of the kernel function for the Nadaraya-Watson conditional CDF estimator. (Default: “gaussian”.)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
parallel (boolean) – The indicator of whether the function should be executed in parallel via multiprocessing. (Default: parallel=False.)
processes (int) – The number of processes for parallel execution. (Default: processes=20.)
- Returns:
m_est ((m,)-array) – The estimated dose-response curve evaluated at points “t_eval”.
m_est_boot ((boot_num, m)-array) – The estimated dose-response curves (or their derivatives) on the bootstrap samples evaluated at points “t_eval”.
m_alpha (float) – The width of the uniform confidence band.
m_alpha_var ((m,)-array) – The widths of the pointwise confidence bands at evaluation points “t_eval”.
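The nonparametric bootstrap behind an output such as m_est_boot can be sketched generically: resample the (Y_i, X_i) pairs with replacement and re-run the estimator on each resample. The estimator below is a deliberately simple placeholder (a straight-line fit), not the integral estimator; nonparametric_bootstrap and toy_estimator are hypothetical names.

```python
import numpy as np

def nonparametric_bootstrap(Y, X, estimator, boot_num=500, seed=0):
    """Resample (Y_i, X_i) pairs with replacement and re-run `estimator` each time."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    reps = []
    for _ in range(boot_num):
        idx = rng.integers(0, n, size=n)     # n row indices drawn with replacement
        reps.append(estimator(Y[idx], X[idx]))
    return np.asarray(reps)                  # shape (boot_num, m)

t_eval = np.linspace(0.1, 0.9, 4)

def toy_estimator(Y, X):
    # Placeholder estimator: straight-line fit of Y on the treatment column X[:, 0].
    slope, intercept = np.polyfit(X[:, 0], Y, deg=1)
    return intercept + slope * t_eval

rng = np.random.default_rng(2)
X = rng.uniform(0.0, 1.0, size=(300, 2))
Y = 1.0 + 2.0 * X[:, 0] + rng.normal(0.0, 0.1, 300)
m_est_boot = nonparametric_bootstrap(Y, X, toy_estimator, boot_num=200)
```

Each row of the result is one bootstrap replicate evaluated at t_eval, which matches the (boot_num, m) shape documented for m_est_boot.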
- npDoseResponse.npDoseResponse.LocalPolyReg(Y, X, x_eval=None, degree=2, deriv_ord=1, h=None, b=None, C_h=7, C_b=3, print_bw=True, kernT='epanechnikov', kernS='epanechnikov', h_lst=numpy.linspace, b_lst=numpy.linspace)[source]
(Partial) Local polynomial regression for estimating the conditional mean outcome function and its partial derivatives. We use higher order local monomials for the treatment variable and first-order local monomials for the confounding variables.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
x_eval ((m,d+1)-array) – The coordinates of the m evaluation points. (Default: x_eval=None. Then, x_eval=X.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
h_lst ((k1,)-array) – Candidate values of h to search over in leave-one-out cross-validation (LOOCV).
b_lst ((k2,)-array) – Candidate values of b to search over in leave-one-out cross-validation (LOOCV).
- Returns:
Y_est – The estimated conditional mean outcome function or its partial derivatives evaluated at points “x_eval”.
- Return type:
(m,)-array
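The partial local polynomial idea described above (higher-order monomials in the treatment, first-order monomials in the confounders) can be sketched at a single evaluation point as a kernel-weighted least-squares fit. This is an illustrative implementation with a product Epanechnikov kernel and user-supplied bandwidths; it omits bandwidth selection and is not the package's code. The names epanechnikov and partial_local_poly are hypothetical.

```python
import numpy as np
from math import factorial

def epanechnikov(u):
    return 0.75 * np.clip(1.0 - u ** 2, 0.0, None)

def partial_local_poly(Y, X, x0, degree=2, deriv_ord=1, h=1.0, b=1.0):
    """Fit at x0 = (t0, s0): monomials up to `degree` in T, linear terms in S."""
    dT = X[:, 0] - x0[0]
    dS = X[:, 1:] - x0[1:]
    # Product kernel: one factor for the treatment, one per confounder.
    w = epanechnikov(dT / h) * np.prod(epanechnikov(dS / b), axis=1)
    design = np.column_stack([dT ** j for j in range(degree + 1)] + [dS])
    sw = np.sqrt(w)                          # weighted least squares via row scaling
    beta, *_ = np.linalg.lstsq(design * sw[:, None], Y * sw, rcond=None)
    # The coefficient of (T - t0)^v times v! estimates the v-th partial derivative in t.
    return factorial(deriv_ord) * beta[deriv_ord]

rng = np.random.default_rng(7)
X = rng.uniform(-1.0, 1.0, size=(400, 3))    # column 0: treatment, columns 1-2: confounders
Y = 2.0 + 3.0 * X[:, 0] + X[:, 0] ** 2 + X[:, 1] - X[:, 2]
x0 = np.array([0.2, 0.0, 0.0])
deriv_at_x0 = partial_local_poly(Y, X, x0, degree=2, deriv_ord=1, h=0.8, b=0.8)
value_at_x0 = partial_local_poly(Y, X, x0, degree=2, deriv_ord=0, h=0.8, b=0.8)
```

Because the toy outcome is quadratic in the treatment and linear in the confounders, the local model contains it exactly, so the fit recovers the true value and derivative at x0 up to floating-point error.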
- npDoseResponse.npDoseResponse.LocalPolyReg1D(Y, X, h=None, x_eval=None, degree=2, deriv_ord=0, kernel='epanechnikov')[source]
Local polynomial regression in one dimension.
- Parameters:
Y ((m,)-array) – The y coordinates of m data points.
X ((m,)-array) – The x coordinates of m data points.
h (float) – The bandwidth parameter. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used.)
x_eval ((k,)-array) – Vector of evaluation points. (Default: x_eval=None. Then, x_eval=X.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of derivatives of the regression function that are estimated. (Default: deriv_ord=0. Then, it is the usual local polynomial regression.)
- Returns:
Y_est – The estimated function or its derivatives by local polynomial regression evaluated at points “x_eval”.
- Return type:
(k,)-array
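For intuition, one-dimensional local polynomial regression can be written as a kernel-weighted polynomial fit centred at each evaluation point. The sketch below uses an Epanechnikov kernel and a fixed bandwidth supplied by the caller; it is a generic textbook version under those assumptions, and local_poly_1d is a hypothetical name, not the package function.

```python
import numpy as np
from math import factorial

def local_poly_1d(Y, X, x_eval, h, degree=2, deriv_ord=0):
    """Local polynomial estimate of the deriv_ord-th derivative at each point of x_eval."""
    out = np.empty(len(x_eval))
    for k, x0 in enumerate(x_eval):
        d = X - x0
        w = 0.75 * np.clip(1.0 - (d / h) ** 2, 0.0, None)    # Epanechnikov weights
        design = np.vander(d, degree + 1, increasing=True)   # columns 1, d, d^2, ...
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(design * sw[:, None], Y * sw, rcond=None)
        out[k] = factorial(deriv_ord) * beta[deriv_ord]
    return out

X = np.linspace(0.0, 1.0, 200)
Y = 1.0 - 2.0 * X + 3.0 * X ** 2
x_eval = np.array([0.25, 0.5, 0.75])
fit = local_poly_1d(Y, X, x_eval, h=0.2, degree=2, deriv_ord=0)
slope = local_poly_1d(Y, X, x_eval, h=0.2, degree=2, deriv_ord=1)
```

The output has one entry per evaluation point, so its length follows x_eval rather than the number of data points.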
- npDoseResponse.npDoseResponse.LocalPolyRegMain(Y, X, x_eval=None, degree=2, deriv_ord=1, h=None, b=None, kernT='epanechnikov', kernS='epanechnikov')[source]
Main function for computing the local polynomial regression.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
x_eval ((m,d+1)-array) – The coordinates of the m evaluation points. (Default: x_eval=None. Then, x_eval=X.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
h (float) – The bandwidth parameter for the treatment/exposure variable.
b (float) – The bandwidth parameter for the confounding variables.
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
- Returns:
Y_est – The estimated conditional mean outcome function or its partial derivatives evaluated at points “x_eval”.
- Return type:
(m,)-array
- npDoseResponse.npDoseResponse.LocalPolyReg_Fs(x_eval, Y, X, degree=2, deriv_ord=1, h=None, b=None, C_h=7, C_b=3, print_bw=True, kernT='epanechnikov', kernS='epanechnikov', h_lst=numpy.linspace, b_lst=numpy.linspace)[source]
(Partial) Local polynomial regression for estimating the conditional mean outcome function and its partial derivatives. We use higher order local monomials for the treatment variable and first-order local monomials for the confounding variables. (This function is for multi-process execution only.)
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
x_eval ((m,d+1)-array) – The coordinates of the m evaluation points. (Default: x_eval=None. Then, x_eval=X.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
h_lst ((k1,)-array) – Candidate values of h to search over in leave-one-out cross-validation (LOOCV).
b_lst ((k2,)-array) – Candidate values of b to search over in leave-one-out cross-validation (LOOCV).
- Returns:
Y_est – The estimated conditional mean outcome function or its partial derivatives evaluated at points “x_eval”.
- Return type:
(m,)-array
- npDoseResponse.npDoseResponse.RegAdjust(Y, X, t_eval=None, h=None, b=None, C_h=7, C_b=3, print_bw=True, degree=2, deriv_ord=0, kernT='epanechnikov', kernS='epanechnikov', parallel=False, processes=20)[source]
Estimating the dose-response curve via simple integral estimator with linear interpolation approximation.
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0].)
h (float) – The bandwidth parameter for the treatment/exposure variable. (Default: h=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_h.)
b (float) – The bandwidth parameter for the confounding variables. (Default: b=None. Then, the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999) is used with the additional scaling factor C_b.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=0. Then, it estimates the conditional mean outcome function itself.)
kernT (str) – The name of the kernel function for the treatment/exposure variable. (Default: “epanechnikov”.)
kernS (str) – The name of the kernel function for the confounding variables. (Default: “epanechnikov”.)
parallel (boolean) – The indicator of whether the function should be executed in parallel via multiprocessing. (Default: parallel=False.)
processes (int) – The number of processes for parallel execution. (Default: processes=20.)
- Returns:
m_est – The estimated dose-response curve (or its derivative) evaluated at points “t_eval”.
- Return type:
(m,)-array
- npDoseResponse.npDoseResponseDR.DRCurve(Y, X, t_eval=None, est='RA', mu=None, condTS_type=None, condTS_mod=None, L=1, h=None, kern='epanechnikov', tau=0.001, h_cond=None, self_norm=True, print_bw=True)[source]
Dose-response curve estimation under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0], which consists of the observed treatment variables.)
est (str) – The type of the dose-response curve estimator. (Default: est=“RA”. Other choices include “IPW” and “DR”.)
mu (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset.
h (float) – The bandwidth parameter for the IPW/DR estimator. (Default: h=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
kern (str) – The name of the kernel function. (Default: kern=“epanechnikov”.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
h_cond (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: h_cond=None.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
- Returns:
m_est ((m,)-array) – The estimated dose-response curve evaluated at points “t_eval”.
sd_est ((m,)-array (if est=”DR”)) – The estimated asymptotic standard deviation of the DR estimator evaluated at points “t_eval”.
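The roles of the `mu` interface (any object with .fit() and .predict()) and of the fold count L can be illustrated with a small cross-fitting sketch. LinearModel is a hypothetical stand-in for a scikit-learn style regressor, and cross_fit_predict takes a model factory so that each fold gets a fresh model; both are assumptions made for this example rather than the package's own mechanics.

```python
import numpy as np

class LinearModel:
    """Hypothetical stand-in for `mu`: any object exposing .fit() and .predict()."""
    def fit(self, X, Y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, Y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([np.ones(len(X)), X]) @ self.coef_

def cross_fit_predict(model_factory, Y, X, L=1, seed=0):
    """Out-of-fold predictions with L folds; L <= 1 fits once on all the data."""
    n = len(Y)
    if L <= 1:
        return model_factory().fit(X, Y).predict(X)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), L)
    pred = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False              # fit on every fold except the current one
        pred[test_idx] = model_factory().fit(X[train], Y[train]).predict(X[test_idx])
    return pred

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))                # column 0: treatment, column 1: confounder
Y = 1.0 + 2.0 * X[:, 0] - X[:, 1]
pred_cf = cross_fit_predict(LinearModel, Y, X, L=5)
pred_full = cross_fit_predict(LinearModel, Y, X, L=1)
```

With cross-fitting, every prediction comes from a model that never saw that observation during fitting.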
- npDoseResponse.npDoseResponseDR.DRDR(Y, X, t_eval, mu, condTS_type, condTS_mod, L, h, kern='epanechnikov', tau=0.001, b=None, self_norm=True)[source]
Estimating the dose-response curve through the doubly robust (DR) form.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset.
h (float) – The bandwidth parameter.
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
m_est ((m,)-array) – The estimated dose-response curve evaluated at points “t_eval”.
sd_est ((m,)-array) – The estimated asymptotic standard deviation of the DR estimator evaluated at points “t_eval”.
- npDoseResponse.npDoseResponseDR.IPWDR(Y, X, t_eval, condTS_type, condTS_mod, L, h, kern='epanechnikov', tau=0.001, b=None, self_norm=True)[source]
Estimating the dose-response curve through the inverse probability weighting (IPW) form.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset.
h (float) – The bandwidth parameter.
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
m_est ((m,)-array) – The estimated dose-response curve evaluated at points “t_eval”.
cond_est_full ((n,)-array) – The estimated conditional density function of T given S evaluated at the n observed data points.
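How h, tau, and self_norm interact in a kernel-smoothed IPW estimate can be sketched as below. The conditional density values are taken as given (here known exactly for a toy design), the kernel is fixed to Epanechnikov, and the two normalizations shown are common textbook forms; treat all of this as an assumption-laden illustration, with ipw_curve a hypothetical name.

```python
import numpy as np

def ipw_curve(Y, T, cond_dens, t_eval, h, tau=0.001, self_norm=True):
    """Kernel-smoothed IPW estimate of m(t) given density values p(T_i | S_i)."""
    dens = np.maximum(cond_dens, tau)                 # lower-bound the densities by tau
    u = (t_eval[:, None] - T[None, :]) / h            # (m, n) scaled distances
    K = 0.75 * np.clip(1.0 - u ** 2, 0.0, None)       # Epanechnikov kernel
    weights = K / dens[None, :]
    if self_norm:
        return (weights @ Y) / weights.sum(axis=1)    # weights normalized to sum to one
    return (weights @ Y) / (len(Y) * h)               # plain 1/(n h) normalization

rng = np.random.default_rng(4)
n = 2000
T = rng.uniform(0.0, 1.0, n)
Y = 2.0 * T                                  # toy outcome, m(t) = 2 t
cond_dens = np.ones(n)                       # T uniform and independent of S, so p(T|S) = 1
t_eval = np.array([0.3, 0.5, 0.7])
m_ipw = ipw_curve(Y, T, cond_dens, t_eval, h=0.1)
m_ipw_clipped = ipw_curve(Y, T, np.zeros(n), t_eval, h=0.1)  # tau prevents division by zero
```

Self-normalization makes the estimate a weighted average of the observed outcomes, which keeps it within their range even when some inverse-density weights are large.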
- npDoseResponse.npDoseResponseDR.RegAdjustDR(Y, X, t_eval, mu, L=1, multi_boot=False, B=1000)[source]
Estimating the dose-response curve through the regression adjustment (or G-computation) form.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
multi_boot (boolean) – An indicator of whether the multiplier bootstrap will be run. (Default: multi_boot=False.)
B (int) – The number of bootstrap replications. (Default: B=1000.)
- Returns:
m_est ((m,)-array) – The estimated dose-response curve evaluated at points “t_eval”.
mu_boot ((B,m)-array) – The estimated dose-response curves on the bootstrap samples evaluated at points “t_eval”. (Returned only when multi_boot=True.)
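The regression adjustment (G-computation) form can be sketched in a few lines: fit the outcome model, then for each evaluation point t set every observation's treatment to t and average the predictions over the observed confounders. The sketch omits cross-fitting and the multiplier bootstrap, and LinearModel is a hypothetical stand-in for `mu`.

```python
import numpy as np

class LinearModel:
    """Hypothetical stand-in for `mu`: any object exposing .fit() and .predict()."""
    def fit(self, X, Y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, Y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([np.ones(len(X)), X]) @ self.coef_

def regression_adjustment(mu, Y, X, t_eval):
    """m(t) estimated as the average of mu(t, S_i) over the observed confounders S_i."""
    mu.fit(X, Y)
    m_est = np.empty(len(t_eval))
    for k, t in enumerate(t_eval):
        X_t = X.copy()
        X_t[:, 0] = t                        # set everyone's treatment to t
        m_est[k] = mu.predict(X_t).mean()
    return m_est

rng = np.random.default_rng(5)
X = rng.uniform(0.0, 1.0, size=(200, 2))     # column 0: treatment, column 1: confounder
Y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]
t_eval = np.linspace(0.0, 1.0, 5)
m_ra = regression_adjustment(LinearModel(), Y, X, t_eval)
```

In this noise-free linear example the result equals 1 + 2t + 3 times the sample mean of the confounder.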
- npDoseResponse.npDoseResponseDerivDR.DRDRDeriv(Y, X, t_eval, mu, condTS_type, condTS_mod, L, h, kern='epanechnikov', n_iter=1000, lr=0.01, tau=0.001, b=None, self_norm=True)[source]
Estimating the derivative of a dose-response curve through the doubly robust (DR) form by a PyTorch neural network model under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (a neural network class defined by PyTorch) – The conditional mean outcome (or regression) model of Y given X.
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset.
h (float) – The bandwidth parameter.
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
n_iter (int) – The number of iterations or training epochs of the neural network model. (Default: n_iter=1000.)
lr (float) – The learning rate for training the neural network model. (Default: lr=0.01.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
theta_est ((m,)-array) – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
sd_est ((m,)-array) – The estimated asymptotic standard deviation of the DR derivative estimator evaluated at points “t_eval”.
- npDoseResponse.npDoseResponseDerivDR.DRDRDerivBC(Y, X, t_eval, mu, L=1, h=None, kern='epanechnikov', n_iter=1000, lr=0.01, b=None, thres_val=0.75, self_norm=True)[source]
Estimating the derivative of a dose-response curve through the doubly robust (DR) form by a PyTorch neural network model without assuming the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (a neural network class defined by PyTorch) – The conditional mean outcome (or regression) model of Y given X.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
h (float) – The bandwidth parameter. (Default: h=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
n_iter (int) – The number of iterations or training epochs of the neural network model. (Default: n_iter=1000.)
lr (float) – The learning rate for training the neural network model. (Default: lr=0.01.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
thres_val (float) – The threshold factor that is multiplied to the maximum conditional density values of S given T evaluated at the sample points. (Default: thres_val=0.75.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
theta_est ((m,)-array) – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
sd_est ((m,)-array) – The estimated asymptotic standard deviation of the DR derivative estimator evaluated at points “t_eval”.
- npDoseResponse.npDoseResponseDerivDR.DRDRDerivSKLearn(Y, X, t_eval, mu, condTS_type, condTS_mod, L, h, kern='epanechnikov', tau=0.001, b=None, delta=0.01, self_norm=True)[source]
Estimating the derivative of a dose-response curve through the doubly robust (DR) form under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
h (float) – The bandwidth parameter.
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
n_iter (int) – The number of iterations or training epochs of the neural network model. (Default: n_iter=1000.)
lr (float) – The learning rate (Default: lr=0.01.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
delta (float) – The step value for computing the finite differences (or numerical partial differentiation) of the fitted regression model. (Default: delta=0.01.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
theta_est ((m,)-array) – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
sd_est ((m,)-array) – The estimated asymptotic standard deviation of the DR derivative estimator evaluated at points “t_eval”.
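The cross-fitting rule described for the parameter L can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: `cross_fit_predict`, `LinearModel`, and the random fold assignment are hypothetical names and choices introduced here. Only the rule that L <= 1 means fitting on the entire dataset comes from the parameter description.

```python
import numpy as np

class LinearModel:
    """Hypothetical stand-in for any model exposing .fit() and .predict()."""
    def fit(self, X, Y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, Y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([np.ones(len(X)), X]) @ self.coef_

def cross_fit_predict(Y, X, make_model, L=1, seed=0):
    """Out-of-fold predictions with L folds; L <= 1 fits on all data."""
    n = len(Y)
    preds = np.empty(n)
    if L <= 1:
        preds[:] = make_model().fit(X, Y).predict(X)
        return preds
    # Assumption: folds are formed by a random permutation of the indices.
    perm = np.random.default_rng(seed).permutation(n)
    for held_out in np.array_split(perm, L):
        train = np.setdiff1d(perm, held_out)
        model = make_model().fit(X[train], Y[train])
        preds[held_out] = model.predict(X[held_out])
    return preds

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
Y = 1.0 + X @ np.array([2.0, -1.0, 0.5])  # noise-free linear outcome
pred_full = cross_fit_predict(Y, X, LinearModel, L=1)
pred_cf = cross_fit_predict(Y, X, LinearModel, L=5)
```

With L=5, each observation is predicted by a model that never saw it during fitting, which is the usual motivation for cross-fitting in doubly robust estimation.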
- npDoseResponse.npDoseResponseDerivDR.DRDerivCurve(Y, X, t_eval=None, est='RA', beta_mod=None, n_iter=1000, lr=0.01, condTS_type=None, condTS_mod=None, L=1, h=None, kern='epanechnikov', tau=0.001, h_cond=None, delta=0.01, self_norm=True, print_bw=True)[source]
Dose-response curve derivative estimation under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points. (Default: t_eval=None. Then, t_eval=X[:,0], which consists of the observed treatment variables.)
est (str) – The type of the dose-response curve estimator. (Default: est=”RA”. Other choices include “IPW” and “DR”.)
beta_mod (PyTorch neural network class or scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
n_iter (int) – The number of iterations or training epochs of the neural network model. (Default: n_iter=1000.)
lr (float) – The learning rate (Default: lr=0.01.)
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
h (float) – The bandwidth parameter for the IPW/DR estimator. (Default: h=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
h_cond (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: h_cond=None.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
- Returns:
theta_est ((m,)-array) – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
sd_est ((m,)-array (if est=”DR”)) – The estimated asymptotic standard deviation of the DR derivative estimator evaluated at points “t_eval”.
- npDoseResponse.npDoseResponseDerivDR.IPWDRDeriv(Y, X, t_eval, condTS_type, condTS_mod, L, h, kern='epanechnikov', tau=0.001, b=None, self_norm=True)[source]
Estimating the derivative of a dose-response curve through the inverse probability weighting (IPW) form under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
condTS_type (str) – Specifying the model type for estimating the conditional density of the treatment variable T given the covariate vector S.
condTS_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The regression model for estimating the conditional density of T given S.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
h (float) – The bandwidth parameter.
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
tau (float) – The threshold value that lower bounds the estimated conditional density values. (Default: tau=0.001.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
theta_est – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
- Return type:
(m,)-array
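As a rough illustration of the inverse probability weighting idea, the sketch below uses one textbook-style kernel-weighted form, theta(t) = (1/(n h^2)) * sum_i Y_i u_i K(u_i) / (kappa_2 p(T_i|S_i)) with u_i = (T_i - t)/h. Whether IPWDRDeriv uses exactly this form is not confirmed by the text above. The function `ipw_deriv_sketch`, the simulated data, and the use of the true conditional density in place of a fitted "condTS_mod" are all assumptions, and the self-normalized variant is not shown.

```python
import numpy as np

def epan(u):
    """Epanechnikov kernel on [-1, 1]."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

KAPPA2 = 0.2  # second moment of the Epanechnikov kernel

def ipw_deriv_sketch(Y, T, dens_TS, t_eval, h, tau=0.001):
    """Hypothetical IPW-type derivative estimate at each point of t_eval."""
    dens = np.maximum(dens_TS, tau)  # lower-bound the density values by tau
    theta = np.empty(len(t_eval))
    for j, t in enumerate(t_eval):
        u = (T - t) / h
        theta[j] = np.mean(Y * u * epan(u) / dens) / (h**2 * KAPPA2)
    return theta

rng = np.random.default_rng(0)
n = 50_000
S = rng.normal(size=n)
T = rng.normal(size=n)  # independent of S, so p(t|s) is the N(0,1) density
Y = T**2 + S + 0.1 * rng.normal(size=n)  # true derivative curve is 2t
dens_TS = np.exp(-0.5 * T**2) / np.sqrt(2 * np.pi)
t_eval = np.array([-0.5, 0.0, 0.5])
theta_ipw = ipw_deriv_sketch(Y, T, dens_TS, t_eval, h=0.5)
```

Clipping the density at tau only matters where the estimated density is very small; in this simulated example it has no effect because the density is well above 0.001 near the evaluation points.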
- npDoseResponse.npDoseResponseDerivDR.IPWDRDerivBC(Y, X, t_eval, L=1, h=None, kern='epanechnikov', b=None, thres_val=0.75, self_norm=True)[source]
Estimating the derivative of a dose-response curve through the inverse probability weighting (IPW) form without assuming the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
h (float) – The bandwidth parameter. (Default: h=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
b (float) – The bandwidth parameter for the kernel-smoothed conditional density estimation methods. (Default: b=None.)
thres_val (float) – The threshold factor that multiplies the maximum conditional density values of S given T evaluated at the sample points. (Default: thres_val=0.75.)
self_norm (boolean) – An indicator of whether the self-normalized version is implemented. (Default: self_norm=True.)
- Returns:
theta_est – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
- Return type:
(m,)-array
- class npDoseResponse.npDoseResponseDerivDR.NeurNet(*args: Any, **kwargs: Any)[source]
Bases: Module
- npDoseResponse.npDoseResponseDerivDR.RADRDeriv(Y, X, t_eval, mu, L=1, n_iter=1000, lr=0.1, multi_boot=False, B=1000)[source]
Estimating the derivative of a dose-response curve through the regression adjustment (or G-computation) form by a PyTorch neural network model under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (a neural network class defined by PyTorch) – The conditional mean outcome (or regression) model of Y given X.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
n_iter (int) – The number of iterations or training epochs of the neural network model. (Default: n_iter=1000.)
lr (float) – The learning rate (Default: lr=0.01.)
multi_boot (boolean) – An indicator of whether the multiplier bootstrap will be run. (Default: multi_boot=False.)
B (int) – The number of bootstrapping times. (Default: B=1000.)
- Returns:
theta_est ((m,)-array) – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
mu_boot ((B,m)-array) – The estimated derivatives of the dose-response curves on bootstrapping data evaluated at points “t_eval”. (Only return this quantity when “multi_boot=True”.)
- npDoseResponse.npDoseResponseDerivDR.RADRDerivBC(Y, X, t_eval, mu, L=1, n_iter=1000, lr=0.01, h_bar=None, kernT_bar='gaussian', print_bw=False)[source]
Estimating the derivative of a dose-response curve through the regression adjustment (or G-computation) form by a PyTorch neural network model without assuming the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (a neural network class defined by PyTorch) – The conditional mean outcome (or regression) model of Y given X.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
n_iter (int) – The number of iterations or training epochs of the neural network model. (Default: n_iter=1000.)
lr (float) – The learning rate (Default: lr=0.01.)
h_bar (float) – The bandwidth parameter for the Nadaraya-Watson conditional CDF estimator. (Default: h_bar=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
kernT_bar (str) – The name of the kernel function for Nadaraya-Watson conditional CDF estimator. (Default: “gaussian”.)
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=False.)
- Returns:
theta_C – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
- Return type:
(m,)-array
- npDoseResponse.npDoseResponseDerivDR.RADRDerivSKLearn(Y, X, t_eval, mu, L=1, delta=0.01)[source]
Estimating the derivative of a dose-response curve through the regression adjustment (or G-computation) form under the positivity condition.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
t_eval ((m,)-array) – The coordinates of the m evaluation points.
mu (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
L (int) – The number of data folds for cross-fitting. When L <= 1, no cross-fitting is applied and the regression model is fitted on the entire dataset. (Default: L=1.)
delta (float) – The step value for computing the finite differences (or numerical partial differentiation) of the fitted regression model. (Default: delta=0.01.)
- Returns:
theta_est – The estimated derivative of the dose-response curve evaluated at points “t_eval”.
- Return type:
(m,)-array
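The role of "delta" in a regression adjustment derivative can be sketched with a central finite difference of a fitted model, averaged over the observed confounders. This is a minimal sketch under assumptions: `ra_deriv_fd` and `QuadraticModel` are hypothetical names, the central-difference scheme is chosen here for illustration, and the package may use a different differencing scheme.

```python
import numpy as np

class QuadraticModel:
    """Hypothetical stand-in for any model exposing .fit() and .predict()."""
    def _features(self, X):
        t, s = X[:, 0], X[:, 1]
        return np.column_stack([np.ones(len(X)), t, t**2, s])

    def fit(self, X, Y):
        self.coef_, *_ = np.linalg.lstsq(self._features(X), Y, rcond=None)
        return self

    def predict(self, X):
        return self._features(X) @ self.coef_

def ra_deriv_fd(Y, X, t_eval, mu, delta=0.01):
    """Average central difference of mu(t, S_i) in t over the sample."""
    mu.fit(X, Y)
    theta = np.empty(len(t_eval))
    for j, t in enumerate(t_eval):
        X_up = X.copy()
        X_dn = X.copy()
        X_up[:, 0] = t + delta  # set every treatment value to t + delta
        X_dn[:, 0] = t - delta  # set every treatment value to t - delta
        diff = mu.predict(X_up) - mu.predict(X_dn)
        theta[j] = np.mean(diff / (2 * delta))
    return theta

rng = np.random.default_rng(0)
n = 500
S = rng.normal(size=n)
T = S + rng.normal(size=n)
Y = T**2 + S + 0.1 * rng.normal(size=n)  # true derivative curve is 2t
X = np.column_stack([T, S])
t_eval = np.array([-1.0, 0.0, 1.0])
theta_fd = ra_deriv_fd(Y, X, t_eval, QuadraticModel())
```

A smaller delta reduces the discretization error for smooth models but can amplify noise for non-smooth ones such as tree ensembles, which is one reason the step value is exposed as a parameter.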
- npDoseResponse.npDoseResponseDerivDR.train(mod, X_train, Y_train, lr=0.01, n_epochs=10, momentum=0.7, weight_decay=0, print_loss=True)[source]
Utility function for training the PyTorch neural network model via stochastic gradient descent.
- Parameters:
mod (python class) – The neural network class defined by PyTorch.
X_train ((n,d+1)-torch.Tensor) – The first column of “X_train” is the treatment/exposure variable, while the other d columns are the confounding variables of n observations.
Y_train ((n,)-torch.Tensor) – The outcome variables of n observations.
lr (float) – The learning rate (Default: lr=0.01.)
n_epochs (int) – The number of training epochs. (Default: n_epochs=10.)
momentum (float) – The momentum factor (Default: momentum=0.7.)
weight_decay (float) – The weight decay (L2 penalty) (Default: weight_decay=0.)
print_loss (boolean) – An indicator of whether the training loss will be printed to the console. (Default: print_loss=True.)
- Returns:
model – The fitted model instance of a neural network class defined by PyTorch.
- Return type:
python object
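The roles of "lr", "momentum", and "weight_decay" can be illustrated with a plain NumPy version of the momentum update on a linear least-squares problem. This is only a sketch: it assumes the update convention of torch.optim.SGD (weight decay added to the gradient, then a momentum buffer), uses full-batch gradients rather than mini-batches, and does not reproduce the internals of `train`. The name `sgd_momentum_sketch` and the choice of 1000 epochs are hypothetical.

```python
import numpy as np

def sgd_momentum_sketch(X, Y, lr=0.01, n_epochs=1000, momentum=0.7,
                        weight_decay=0.0):
    """Full-batch gradient descent with momentum on mean squared error."""
    n, d = X.shape
    w = np.zeros(d)
    buf = np.zeros(d)  # momentum buffer
    for _ in range(n_epochs):
        grad = 2.0 / n * X.T @ (X @ w - Y)  # gradient of the MSE loss
        grad = grad + weight_decay * w      # L2 penalty enters the gradient
        buf = momentum * buf + grad         # accumulate past gradients
        w = w - lr * buf                    # parameter update
    return w

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
Y_train = X_train @ w_true  # noise-free linear outcome
w_hat = sgd_momentum_sketch(X_train, Y_train)
```

With momentum 0.7, the effective step size is roughly lr / (1 - 0.7), so the default lr=0.01 behaves like a step of about 0.033 once the buffer has warmed up.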
Implementations of Common Kernel Functions
- npDoseResponse.rbf.KernelRetrieval(name)[source]
Retrieving the kernel function, its second moment, and its variance based on the name.
- Parameters:
name (str) – The name of the kernel function.
- Returns:
kern_func (python function) – The kernel function.
sigmaK_sq (float) – The second moment of the kernel function.
K_sq (float) – The variance of the kernel function.
- npDoseResponse.rbf.bigaussian(t)[source]
Bigaussian kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.biweight(t)[source]
Biweight/quartic kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.cosine(t)[source]
Cosine kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.epanechnikov(t)[source]
Epanechnikov kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.gaussian(t)[source]
Gaussian kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.logistic(t)[source]
Logistic kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.rectangular(t)[source]
Rectangular/uniform kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.sigmoid(t)[source]
Sigmoid kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.silverman(t)[source]
Silverman kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
- npDoseResponse.rbf.triangular(t)[source]
Triangular kernel function.
- Parameters:
t (float or (n,)-array) – The query points.
- Returns:
res – The kernel values evaluated at the query points.
- Return type:
float or (n,)-array
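As an illustration of the kernel constants mentioned for KernelRetrieval, the sketch below writes the standard Epanechnikov kernel K(t) = 0.75 (1 - t^2) on [-1, 1] and checks three of its integrals numerically. The function `epanechnikov_sketch` is a hypothetical re-implementation, not the package's function, and which of these constants the package returns as "sigmaK_sq" and "K_sq" is not asserted here.

```python
import numpy as np

def epanechnikov_sketch(t):
    """Standard Epanechnikov kernel: 0.75 * (1 - t^2) on [-1, 1], else 0."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

# Midpoint-rule grid on the kernel support [-1, 1].
m = 200_000
dx = 2.0 / m
grid = -1.0 + (np.arange(m) + 0.5) * dx
k_vals = epanechnikov_sketch(grid)

total_mass = np.sum(k_vals) * dx               # integral of K, equals 1
second_moment = np.sum(grid**2 * k_vals) * dx  # integral of t^2 K, equals 1/5
sq_integral = np.sum(k_vals**2) * dx           # integral of K^2, equals 3/5
```

The second moment and the integral of the squared kernel are the constants that usually appear in the asymptotic bias and variance of kernel estimators, respectively.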
Utility Functions
- npDoseResponse.utils.BndKern(x_qry, kern, deriv_ord=0, alpha=1, bnd='left')[source]
Generalized jackknife boundary kernel.
- Parameters:
x_qry ((m,)-array) – The coordinates of m query points in the 1-dimensional Euclidean space.
kern (python function) – The kernel function.
deriv_ord (int) – The order of the derivative estimator. (Default: deriv_ord=0, which is for nonparametric density or curve estimation.)
alpha (float) – The truncated proportion of the kernel support (0 <= alpha <= 1). (Default: alpha=1, which recovers the original kernel function for the interior points.)
bnd (str) – Indicator of whether the input point is within the left or right boundary of the support. (Default: bnd=’left’.)
- Returns:
res – The boundary kernel function evaluated at m query points.
- Return type:
(m,)-array
- npDoseResponse.utils.CondDenEst(Y, X, reg_mod, y_eval=None, x_eval=None, kern='gaussian', b=None, poly_ext=False)[source]
Conditional density estimation via nonparametric regression on the kernel-smoothed outcome variables.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d)-array) – The d-dimensional covariates of n observations.
reg_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
y_eval ((m,)-array) – The outcome variables on which we evaluate the estimated conditional densities.
x_eval ((m,d)-array) – The covariates on which we evaluate the estimated conditional densities.
kern (str) – The name of the kernel function. (Default: kern=”gaussian”.)
b (float) – The bandwidth parameter for KDE. (Default: b=None.)
poly_ext (boolean) – The indicator of whether polynomial features are generated from the current covariates. (Default: poly_ext=False.)
- Returns:
cond_est – The estimated conditional densities at the m query points.
- Return type:
(m,)-array
- npDoseResponse.utils.CondDenEstKDE(Y, X, reg_mod, y_eval=None, x_eval=None, kern='epanechnikov', b=None)[source]
Conditional density estimation by applying the kernel density estimator (KDE) on the regression residuals.
- Parameters:
Y ((n,)-array) – The outcome variables of n observations.
X ((n,d)-array) – The d-dimensional covariates of n observations.
reg_mod (scikit-learn model or any python model that can use ".fit()" and ".predict()") – The conditional mean outcome (or regression) model of Y given X.
y_eval ((m,)-array) – The outcome variables at which we evaluate the estimated conditional densities.
x_eval ((m,d)-array) – The covariates at which we evaluate the estimated conditional densities.
kern (str) – The name of the kernel function. (Default: kern=”epanechnikov”.)
b (float) – The bandwidth parameter for KDE. (Default: b=None.)
- Returns:
cond_est – The estimated conditional densities at the m query points.
- Return type:
(m,)-array
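The idea of applying a KDE to regression residuals can be sketched as follows: fit a regression of Y on X, form the residuals, and evaluate a one-dimensional KDE of the residuals at y - mu(x). This is a minimal sketch that assumes the residual distribution does not depend on X. The names `cond_den_resid_kde` and `LinearModel`, the fixed bandwidth b=0.4, and the simulated data are assumptions, not the package's implementation.

```python
import numpy as np

def epan(u):
    """Epanechnikov kernel on [-1, 1]."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

class LinearModel:
    """Hypothetical stand-in for any model exposing .fit() and .predict()."""
    def fit(self, X, Y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, Y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([np.ones(len(X)), X]) @ self.coef_

def cond_den_resid_kde(Y, X, reg_mod, y_eval, x_eval, b):
    """Estimate p(y|x) by a KDE of the regression residuals at y - mu(x)."""
    reg_mod.fit(X, Y)
    resid = Y - reg_mod.predict(X)               # (n,) residuals
    centred = y_eval - reg_mod.predict(x_eval)   # (m,) query residuals
    u = (centred[:, None] - resid[None, :]) / b  # (m, n) scaled differences
    return epan(u).mean(axis=1) / b

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 1))
Y = 2.0 * X[:, 0] + rng.normal(size=n)  # Y | X = x is N(2x, 1)
x_eval = np.array([[-1.0], [0.0], [1.0]])
y_eval = 2.0 * x_eval[:, 0]             # evaluate at the conditional mean
cond_est = cond_den_resid_kde(Y, X, LinearModel(), y_eval, x_eval, b=0.4)
```

In this simulated example the true conditional density at the conditional mean is the N(0,1) density at zero, about 0.399, at every query point.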
- npDoseResponse.utils.HatMatrix(X, degree=2, deriv_ord=1, h=None, b=None, print_bw=True, kernT='epanechnikov', kernS='epanechnikov')[source]
Compute the hat matrix of the local polynomial regression when it is viewed as a linear smoother.
- Parameters:
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
degree (int) – Degree of local polynomials. (Default: degree=2.)
deriv_ord (int) – The order of the estimated derivative of the conditional mean outcome function. (Default: deriv_ord=1. Then, it estimates the partial derivative of the conditional mean outcome function with respect to the treatment variable.)
h (float) – The bandwidth parameters for the treatment/exposure variable and confounding variables.
b (float) – The bandwidth parameters for the treatment/exposure variable and confounding variables.
print_bw (boolean) – The indicator of whether the current bandwidth parameters should be printed to the console. (Default: print_bw=True.)
kernT (str) – The names of kernel functions for the treatment/exposure variable and confounding variables. (Default: “epanechnikov”.)
kernS (str) – The names of kernel functions for the treatment/exposure variable and confounding variables. (Default: “epanechnikov”.)
- Returns:
hat_mat – The hat matrix.
- Return type:
(n,n)-array
- npDoseResponse.utils.KDE(x, data, kern='gaussian', h=None)[source]
The d-dimensional Euclidean kernel density estimator.
- Parameters:
x ((m,d)-array) – The coordinates of m query points in the d-dim Euclidean space.
data ((n,d)-array) – The coordinates of n random sample points in the d-dimensional Euclidean space.
kern (str) – The name of the kernel function. (Default: “gaussian”.)
h (float) – The bandwidth parameter. (Default: h=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
- Returns:
f_hat – The corresponding kernel density estimates at m query points.
- Return type:
(m,)-array
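A d-dimensional Gaussian KDE with a rule-of-thumb bandwidth can be sketched as below. The bandwidth formula used here, h = (4/(d+2))^(1/(d+4)) * n^(-1/(d+4)) * (average coordinate standard deviation), is one common normal-reference form of Silverman's rule. The exact variant applied by the package is not shown in this reference and may differ, and `kde_gaussian_sketch` is a hypothetical name.

```python
import numpy as np

def kde_gaussian_sketch(x, data, h=None):
    """Gaussian KDE at query points x (m, d) from sample points data (n, d)."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    if h is None:
        # Assumption: normal-reference rule of thumb with a single bandwidth.
        sd_avg = np.mean(np.std(data, axis=0))
        h = (4 / (d + 2)) ** (1 / (d + 4)) * n ** (-1 / (d + 4)) * sd_avg
    diff = (x[:, None, :] - data[None, :, :]) / h  # (m, n, d)
    kern_vals = np.exp(-0.5 * np.sum(diff**2, axis=2)) / (2 * np.pi) ** (d / 2)
    return kern_vals.mean(axis=1) / h**d

rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 1))               # sample from N(0, 1)
grid = np.linspace(-6, 6, 1201).reshape(-1, 1)  # query points, spacing 0.01
f_hat = kde_gaussian_sketch(grid, data)
```

The estimate integrates to one over the grid and is close to the N(0,1) density, with a small downward bias at the mode caused by smoothing.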
- npDoseResponse.utils.KDE1D(x, data, kern='epanechnikov', h=None)[source]
One-dimensional kernel density estimation with generalized jackknife boundary corrections (Jones 1993).
- Parameters:
x ((m,)-array) – The coordinates of m query points in the 1-dim Euclidean space.
data ((n,)-array) – The coordinates of n random sample points in the 1-dimensional Euclidean space.
kern (str) – The name of the kernel function. (Default: “epanechnikov”.)
h (float) – The bandwidth parameter. (Default: h=None. Then, Silverman’s rule of thumb is applied; see Chen et al. (2016) for details.)
- Returns:
f_hat – The corresponding kernel density estimates at m query points.
- Return type:
(m,)-array
- npDoseResponse.utils.RoTBWLocalPoly(Y, X, kernT='epanechnikov', kernS='epanechnikov', C_h=10, C_b=15)[source]
Compute the rule-of-thumb bandwidth selector in Eq.(A1) of Yang and Tschernig (1999).
- Parameters:
Y ((n,)-array) – The outcomes of n observations.
X ((n,d+1)-array) – The first column of X is the treatment/exposure variable, while the other d columns are confounding variables of n observations.
kernT (str) – The names of kernel functions for the treatment/exposure variable and confounding variables. (Default: “epanechnikov”.)
kernS (str) – The names of kernel functions for the treatment/exposure variable and confounding variables. (Default: “epanechnikov”.)
C_h (float) – The scaling factors for the rule-of-thumb bandwidth parameters. (Default: C_h=7, C_b=3.)
C_b (float) – The scaling factors for the rule-of-thumb bandwidth parameters. (Default: C_h=7, C_b=3.)
- Returns:
h (float) – The rule-of-thumb bandwidth parameter for the treatment/exposure variable.
b ((d,)-array) – The rule-of-thumb bandwidth vector for the confounding variables.