causal_falsify.algorithms.transport module

class causal_falsify.algorithms.transport.TransportabilityTest(cond_indep_test='kcit_rbf', max_sample_size=inf, seed=None)

Bases: AbstractFalsificationAlgorithm

subsample_data(outcome, source, covariates, treatment)

Subsample data to limit the number of samples while preserving the source distribution.

Parameters:
  • outcome (np.ndarray of shape (n_samples, 1)) – Outcome variable for each sample.

  • source (np.ndarray of shape (n_samples, 1)) – Source indicator for each sample.

  • covariates (np.ndarray of shape (n_samples, n_covariates)) – Observed covariates for each sample.

  • treatment (np.ndarray of shape (n_samples, 1)) – Treatment assignment for each sample.

Return type:

tuple[ndarray, ndarray, ndarray, ndarray]

Returns:

  • outcome_sub (np.ndarray of shape (n_subsamples, 1)) – Subsampled outcomes.

  • source_sub (np.ndarray of shape (n_subsamples, 1)) – Subsampled source indicators.

  • covariates_sub (np.ndarray of shape (n_subsamples, n_covariates)) – Subsampled covariates.

  • treatment_sub (np.ndarray of shape (n_subsamples, 1)) – Subsampled treatment assignments.

Notes

  • The method ensures that each source is represented approximately proportionally to its frequency in the original data.

  • If the total number of selected samples exceeds self.max_sample_size_test, a random subset of the selected samples is drawn to enforce the limit.
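
The subsampling behaviour described in the notes can be illustrated with a minimal sketch. This is not the library's implementation: the function name proportional_subsample, the max_samples argument, and the per-source quota rule (rounded proportional share, at least one sample per source) are assumptions made for illustration only.

```python
import numpy as np


def proportional_subsample(outcome, source, covariates, treatment,
                           max_samples, seed=None):
    """Hypothetical sketch: draw at most `max_samples` rows while keeping
    each source's share close to its share in the full data."""
    rng = np.random.default_rng(seed)
    n_samples = source.shape[0]

    # Nothing to do when the data already fits within the limit
    # (this also covers max_samples = inf).
    if n_samples <= max_samples:
        return outcome, source, covariates, treatment

    labels = source.ravel()
    selected = []
    for value in np.unique(labels):
        idx = np.flatnonzero(labels == value)
        # Assumed quota rule: proportional share, at least one row per source.
        quota = max(1, int(round(max_samples * idx.size / n_samples)))
        selected.append(rng.choice(idx, size=min(quota, idx.size), replace=False))
    selected = np.concatenate(selected)

    # Rounding can overshoot the limit, so enforce it with a random subset.
    if selected.size > max_samples:
        selected = rng.choice(selected, size=max_samples, replace=False)

    selected.sort()
    return (outcome[selected], source[selected],
            covariates[selected], treatment[selected])
```

With 700 rows from one source and 300 from another, a limit of 100 yields 70 and 30 rows respectively under this quota rule.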

test(data, covariate_vars, treatment_var, outcome_var, source_var)

Perform a joint falsification test of unconfoundedness and transportability.

Parameters:
  • data (pd.DataFrame) – DataFrame containing all required columns.

  • covariate_vars (List[str]) – Covariate column names to condition on.

  • treatment_var (str) – Treatment column name.

  • outcome_var (str) – Outcome column name.

  • source_var (str) – Source/environment indicator column name.

Returns:

p-value of the test; a low p-value suggests that unconfoundedness or transportability is violated, for example because unmeasured confounding is present.

Return type:

float
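
A usage sketch for test(), based only on the signatures documented above. The toy data generator, the column names (X1, X2, A, Y, S), and the helper names make_toy_data and run_transportability_test are hypothetical. Passing max_sample_size=500 assumes the parameter accepts a finite integer. The library import is kept inside the helper so that the data-generation part runs without causal_falsify installed.

```python
import numpy as np
import pandas as pd


def make_toy_data(n_per_source=200, seed=0):
    """Hypothetical two-source dataset with covariates, treatment and outcome."""
    rng = np.random.default_rng(seed)
    frames = []
    for s in range(2):
        x = rng.normal(size=(n_per_source, 2))
        a = rng.binomial(1, 0.5, size=n_per_source)
        # The outcome depends on observed covariates and treatment only.
        y = x[:, 0] + 2.0 * a + rng.normal(size=n_per_source)
        frames.append(pd.DataFrame(
            {"X1": x[:, 0], "X2": x[:, 1], "A": a, "Y": y, "S": s}
        ))
    return pd.concat(frames, ignore_index=True)


def run_transportability_test(data):
    """Call TransportabilityTest.test with the documented arguments."""
    # Imported lazily; requires the causal_falsify package.
    from causal_falsify.algorithms.transport import TransportabilityTest

    algorithm = TransportabilityTest(
        cond_indep_test="kcit_rbf",
        max_sample_size=500,  # assumed to accept a finite cap
        seed=0,
    )
    return algorithm.test(
        data,
        covariate_vars=["X1", "X2"],
        treatment_var="A",
        outcome_var="Y",
        source_var="S",
    )
```

Calling run_transportability_test(make_toy_data()) returns the p-value as a float, per the return type documented above.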