causal_falsify.algorithms.transport module
- class causal_falsify.algorithms.transport.TransportabilityTest(cond_indep_test='kcit_rbf', max_sample_size=inf, seed=None)
Bases: AbstractFalsificationAlgorithm
- subsample_data(outcome, source, covariates, treatment)
Subsample data to limit the number of samples while preserving the source distribution.
- Parameters:
outcome (np.ndarray of shape (n_samples, 1)) – Outcome variable for each sample.
source (np.ndarray of shape (n_samples, 1)) – Source indicator for each sample.
covariates (np.ndarray of shape (n_samples, n_covariates)) – Observed covariates for each sample.
treatment (np.ndarray of shape (n_samples, 1)) – Treatment assignment for each sample.
- Return type:
tuple[ndarray, ndarray, ndarray, ndarray]
- Returns:
outcome_sub (np.ndarray of shape (n_subsamples, 1)) – Subsampled outcomes.
source_sub (np.ndarray of shape (n_subsamples, 1)) – Subsampled source indicators.
covariates_sub (np.ndarray of shape (n_subsamples, n_covariates)) – Subsampled covariates.
treatment_sub (np.ndarray of shape (n_subsamples, 1)) – Subsampled treatment assignments.
Notes
The method ensures that each source is represented approximately proportionally to its frequency in the original data.
If the total number of selected samples exceeds self.max_sample_size_test, a random subset of the selected samples is drawn to enforce the limit.
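Example
The subsampling behaviour described in the Notes can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the function name proportional_subsample, the max_samples argument, and the rounding rule for per-source quotas are assumptions made for illustration only.

```python
import numpy as np


def proportional_subsample(outcome, source, covariates, treatment,
                           max_samples, seed=None):
    """Hypothetical sketch: cap the sample size while keeping each source's share."""
    rng = np.random.default_rng(seed)
    n_samples = source.shape[0]
    if n_samples <= max_samples:
        # Nothing to do when the data already fit within the limit.
        return outcome, source, covariates, treatment

    source_flat = source.ravel()
    selected = []
    for value in np.unique(source_flat):
        idx = np.flatnonzero(source_flat == value)
        # Assumed rule: quota proportional to the source's frequency, at least 1.
        quota = max(1, int(np.ceil(len(idx) * max_samples / n_samples)))
        quota = min(quota, len(idx))
        selected.append(rng.choice(idx, size=quota, replace=False))
    selected = np.concatenate(selected)

    # Rounding up can overshoot the limit; draw a random subset to enforce it.
    if len(selected) > max_samples:
        selected = rng.choice(selected, size=max_samples, replace=False)

    return (outcome[selected], source[selected],
            covariates[selected], treatment[selected])


# Synthetic inputs with the shapes documented above (700 vs. 300 samples per source).
rng = np.random.default_rng(0)
source = np.repeat([0, 1], [700, 300]).reshape(-1, 1)
covariates = rng.normal(size=(1000, 3))
treatment = rng.integers(0, 2, size=(1000, 1))
outcome = rng.normal(size=(1000, 1))

y_sub, s_sub, x_sub, a_sub = proportional_subsample(
    outcome, source, covariates, treatment, max_samples=100, seed=0
)
print(y_sub.shape, x_sub.shape, float((s_sub == 0).mean()))
```

With a 70/30 split between the two sources in the input, the subsample keeps roughly the same split while respecting the limit.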
- test(data, covariate_vars, treatment_var, outcome_var, source_var)
Perform a joint falsification test of unconfoundedness and transportability.
- Parameters:
data (pd.DataFrame) – DataFrame containing all required columns.
covariate_vars (List[str]) – Covariate column names to condition on.
treatment_var (str) – Treatment column name.
outcome_var (str) – Outcome column name.
source_var (str) – Source/environment indicator column name.
- Returns:
p-value of the joint test. A low p-value is evidence against the joint null hypothesis, suggesting that unmeasured confounding or a violation of transportability may be present.
- Return type:
float
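Example
A minimal usage sketch, assuming the causal_falsify package is installed and the signatures match those documented above. The synthetic data, the column names "X", "A", "Y", "S", and the helper functions make_synthetic_data and run_transportability_test are hypothetical and serve only to show the expected input layout. The import is guarded so the snippet still runs when the package is unavailable.

```python
import numpy as np
import pandas as pd


def make_synthetic_data(n_per_source=200, n_sources=3, seed=0):
    """Hypothetical data: one covariate, binary treatment, several sources."""
    rng = np.random.default_rng(seed)
    frames = []
    for s in range(n_sources):
        x = rng.normal(size=n_per_source)
        # Treatment probability depends only on the observed covariate.
        a = rng.binomial(1, 1.0 / (1.0 + np.exp(-x)))
        y = 2.0 * a + x + rng.normal(size=n_per_source)
        frames.append(pd.DataFrame({"X": x, "A": a, "Y": y, "S": s}))
    return pd.concat(frames, ignore_index=True)


def run_transportability_test(data):
    """Run the documented test; return None if causal_falsify is not installed."""
    try:
        from causal_falsify.algorithms.transport import TransportabilityTest
    except ImportError:
        return None
    algorithm = TransportabilityTest(
        cond_indep_test="kcit_rbf", max_sample_size=500, seed=0
    )
    return algorithm.test(
        data,
        covariate_vars=["X"],
        treatment_var="A",
        outcome_var="Y",
        source_var="S",
    )


data = make_synthetic_data()
p_value = run_transportability_test(data)
if p_value is None:
    print("causal_falsify is not installed; skipping the test call.")
else:
    print(f"p-value: {p_value:.3f}")
```

Setting max_sample_size to a finite value caps the number of samples passed to the conditional independence test, which can matter for kernel-based tests whose cost grows quickly with sample size.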