2021-09-01
04-dags-exercises.qmd
Specify a DAG with dagify(). Write your assumption that smoking causes cancer as a formula.
Visualize the DAG with ggdag().
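A minimal sketch of this exercise, assuming the ggdag package is installed. In dagify(), each formula reads "effect ~ cause". The object name smoking_dag is made up for this illustration and does not come from the exercise file.

```r
library(ggdag)

# Formula syntax reads "effect ~ cause": cancer is caused by smoking.
# `smoking_dag` is a hypothetical name used only in this sketch.
smoking_dag <- dagify(cancer ~ smoking)

# Quick plot of the DAG
ggdag(smoking_dag)
```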
02-dags-exercises.qmd
Ok, correlation != causation. But why not?
We want to know if x -> y
…
But other paths also cause associations
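To make this concrete, here is a small simulation sketch in base R. The variable z and the data-generating model are assumptions made for illustration: z causes both x and y, and x has no effect on y at all. The two still end up associated, because the association flows along the path through z.

```r
set.seed(1)
n <- 10000

# Assumed structure for this sketch: z causes both x and y;
# x has NO causal effect on y.
z <- rnorm(n)
x <- z + rnorm(n)
y <- z + rnorm(n)

# x and y are correlated even though x does not cause y
cor_xy <- cor(x, y)

# The unadjusted regression slope picks up the association that flows through z
unadjusted_slope <- unname(coef(lm(y ~ x))["x"])

cor_xy
unadjusted_slope
```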
ggdag_paths()
Identify “backdoor” paths
Call tidy_dagitty() on coffee_cancer_dag to create a tidy DAG, then pass the results to dag_paths(). What’s different about these data?
Plot the open paths with ggdag_paths(). (Just give it coffee_cancer_dag rather than using dag_paths(); the quick plot function will do that for you.) Remember, since we assume there is no causal path from coffee to lung cancer, any open paths must be confounding pathways.
# A DAG with 4 nodes and 3 edges
#
# Exposure: coffee
# Outcome: cancer
#
# A tibble: 5 × 11
set name x y direction to xend yend
<chr> <chr> <dbl> <dbl> <fct> <chr> <dbl> <dbl>
1 1 addictive 0.616 -1.27 -> coff… 0.185 -0.127
2 1 addictive 0.616 -1.27 -> smok… 1.09 -2.52
3 1 cancer 1.52 -3.66 <NA> <NA> NA NA
4 1 coffee 0.185 -0.127 <NA> <NA> NA NA
5 1 smoking 1.09 -2.52 -> canc… 1.52 -3.66
# ℹ 3 more variables: circular <lgl>, label <chr>,
# path <chr>
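Output like the above could be produced roughly as follows. This is a sketch, assuming ggdag and dagitty are installed; the DAG definition is copied from the one used later in this section, without coordinates. The name coffee_paths is hypothetical, and the final dagitty::paths() call is an extra check that the exercise does not ask for.

```r
library(ggdag)

# The DAG used in this section (edges and labels as given in the text)
coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,
  coffee ~ addictive,
  exposure = "coffee",
  outcome = "cancer",
  labels = c(
    "coffee" = "Coffee",
    "cancer" = "Lung Cancer",
    "smoking" = "Smoking",
    "addictive" = "Addictive \nBehavior"
  )
)

# Tidy the DAG, then add path information (the `set` and `path` columns).
# `coffee_paths` is a hypothetical name used only in this sketch.
coffee_paths <- dag_paths(tidy_dagitty(coffee_cancer_dag))
coffee_paths

# Quick plot: ggdag_paths() works out the paths itself
ggdag_paths(coffee_cancer_dag)

# dagitty can also list each path between exposure and outcome
# and report whether it is open
dagitty::paths(coffee_cancer_dag)
```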
We need to account for these open, non-causal paths
Randomization
Stratification, adjustment, weighting, matching, etc.
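The two strategies can be compared in a short base R simulation. This is only a sketch under an assumed model: z is a common cause of exposure and outcome, and the exposure has no effect on the outcome. All variable names are made up for illustration.

```r
set.seed(2)
n <- 10000

# Assumed structure: z is a common cause of exposure and outcome;
# the exposure has no effect on the outcome.
z <- rnorm(n)
x_observed <- z + rnorm(n)   # exposure influenced by z (observational setting)
x_randomized <- rnorm(n)     # exposure assigned at random, independent of z

y_observed <- z + rnorm(n)
y_randomized <- z + rnorm(n)

slopes <- c(
  # Backdoor path through z is open
  unadjusted = unname(coef(lm(y_observed ~ x_observed))["x_observed"]),
  # Adjusting for z closes the backdoor path
  adjusted = unname(coef(lm(y_observed ~ x_observed + z))["x_observed"]),
  # Randomization removes the arrow from z into the exposure
  randomized = unname(coef(lm(y_randomized ~ x_randomized))["x_randomized"])
)
round(slopes, 2)
```

Only the unadjusted estimate should sit far from zero; both adjustment and randomization recover the null effect built into the simulation.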
Use ggdag_adjustment_set() to visualize the adjustment sets. Add the arguments use_labels = "label" and text = FALSE.
Write a model formula for each adjustment set, as you would for lm() or glm().
# A tibble: 500 × 4
addictive cancer coffee smoking
<dbl> <dbl> <dbl> <dbl>
1 0.569 3.11 -0.326 -1.29
2 0.411 1.52 0.330 -1.57
3 1.20 1.06 -0.557 -2.40
4 -0.782 -0.504 -0.148 0.376
5 0.0357 -0.709 -0.342 -1.53
6 1.96 1.05 -1.90 -0.823
7 1.13 0.211 -0.581 -0.534
8 0.697 0.892 -1.36 -0.267
9 -0.779 0.748 0.455 0.302
10 -1.13 0.930 0.568 0.742
# ℹ 490 more rows
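A sketch of how the adjustment sets might be inspected and turned into models, assuming ggdag and dagitty are installed. The label arguments follow the exercise text, and newer ggdag versions may expect different ones. The simulated data come from dagitty::simulateSEM() with randomly drawn path coefficients; that is an assumption of this sketch, not necessarily how the data shown above were generated. The names adj_sets, sim_data, fit_smoking, and fit_addictive are hypothetical.

```r
library(ggdag)

# The DAG used in this section (edges and labels as given in the text)
coffee_cancer_dag <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,
  coffee ~ addictive,
  exposure = "coffee",
  outcome = "cancer",
  labels = c(
    "coffee" = "Coffee",
    "cancer" = "Lung Cancer",
    "smoking" = "Smoking",
    "addictive" = "Addictive \nBehavior"
  )
)

# Visualize the adjustment sets, with the arguments given in the exercise
ggdag_adjustment_set(coffee_cancer_dag, use_labels = "label", text = FALSE)

# List the minimal adjustment sets for the coffee -> cancer effect
adj_sets <- dagitty::adjustmentSets(coffee_cancer_dag)
adj_sets

# Simulate data consistent with the DAG (path coefficients are random)
set.seed(3)
sim_data <- dagitty::simulateSEM(coffee_cancer_dag, N = 500)

# One model formula per adjustment set
fit_smoking <- lm(cancer ~ coffee + smoking, data = sim_data)
fit_addictive <- lm(cancer ~ coffee + addictive, data = sim_data)
```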
Adjustment sets and domain knowledge
Conduct a sensitivity analysis if an important variable is missing from your data
Using prediction metrics
The 10% rule
Predictors of the outcome, predictors of the exposure
Forgetting to consider time-ordering (something has to happen before something else to cause it!)
Selection bias and colliders (more later!)
Incorrect functional form for confounders (e.g. BMI often non-linear)
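The last point, about functional form, can be illustrated with a small simulation. This is a sketch under an assumed model, with variable names chosen for illustration: the confounder acts through its square on both exposure and outcome, and the exposure has no effect on the outcome. Adjusting for the confounder with only a linear term leaves confounding in place, while a flexible term (here a natural cubic spline from the splines package) largely removes it.

```r
library(splines)

set.seed(4)
n <- 10000

# Assumed structure: the confounder affects exposure and outcome non-linearly;
# the exposure has no effect on the outcome.
confounder <- rnorm(n)
exposure <- confounder^2 + rnorm(n)
outcome <- confounder^2 + rnorm(n)

# Linear term only: the functional form is wrong, so confounding remains
slope_linear <- unname(
  coef(lm(outcome ~ exposure + confounder))["exposure"]
)

# Natural cubic spline: flexible enough to capture the curvature
slope_spline <- unname(
  coef(lm(outcome ~ exposure + ns(confounder, df = 4)))["exposure"]
)

c(linear = slope_linear, spline = slope_spline)
```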
Recreate the DAG we’ve been working with using time_ordered_coords(), then visualize the DAG. You don’t need to use any arguments for this function, so coords = time_ordered_coords() will do.
library(ggdag)

# Same DAG as before, with node coordinates laid out in time order
coffee_cancer_dag_to <- dagify(
  cancer ~ smoking,
  smoking ~ addictive,
  coffee ~ addictive,
  exposure = "coffee",
  outcome = "cancer",
  coords = time_ordered_coords(),
  labels = c(
    "coffee" = "Coffee",
    "cancer" = "Lung Cancer",
    "smoking" = "Smoking",
    "addictive" = "Addictive \nBehavior"
  )
)

#TODO: UPDATE LABELS ARGS
# Plot with the readable labels instead of the raw node names
ggdag(coffee_cancer_dag_to, use_labels = "label", text = FALSE)