6 changes: 0 additions & 6 deletions R/figures/fig1.R
@@ -45,7 +45,6 @@ create_figure_1 <- function(
raw_ct_2019,
health_weighted,
refining_mortality,
labor_2019,
ca_regions,
raw_pop_income_2021,
cpi2020,
@@ -1554,11 +1553,6 @@ create_figure_1 <- function(
## merge counties to census tracts
## -----------------------------------------------------------------

# ## join with spatial data 2019
# census_tract_labor_2019_sp <- raw_ct_2019 |>
# rename(census_tract = GEOID) |>
# left_join(census_tract_labor_2020)

# Create the expanded data
census_tracts_l_expanded <- expand_grid(
census_tracts,
24 changes: 12 additions & 12 deletions R/figures/health_labor_revised_figs.R
@@ -1300,8 +1300,8 @@ plot_npv_health_labor <- function(
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
scale_y_continuous(
limits = c(0, 50),
breaks = seq(0, 50, by = 10)
limits = c(0, 40),
breaks = seq(0, 40, by = 10)
) +
xlim(0, 80) +
scale_color_manual(
@@ -1380,8 +1380,8 @@ plot_npv_health_labor <- function(
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
scale_y_continuous(
limits = c(-50, 0),
breaks = seq(-50, 0, by = 10)
limits = c(-40, 0),
breaks = seq(-40, 0, by = 10)
) +
xlim(0, 80) +
scale_color_manual(
@@ -1451,8 +1451,8 @@ plot_npv_health_labor <- function(
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
scale_y_continuous(
limits = c(-50, 0),
breaks = seq(-50, 0, by = 10)
limits = c(-40, 0),
breaks = seq(-40, 0, by = 10)
) +
xlim(0, 80) +
scale_color_manual(
@@ -1523,8 +1523,8 @@ plot_npv_health_labor <- function(
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
scale_y_continuous(
limits = c(-50, 0),
breaks = seq(-50, 0, by = 10)
limits = c(-40, 0),
breaks = seq(-40, 0, by = 10)
) +
xlim(0, 80) +
scale_color_manual(
@@ -1580,7 +1580,7 @@ plot_npv_health_labor <- function(
y = NULL,
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
ylim(-50, 0) +
ylim(-40, 0) +
xlim(0, 80) +
scale_color_manual(
values = refin_colors,
@@ -1655,7 +1655,7 @@ plot_npv_health_labor <- function(
y = NULL,
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
ylim(-50, 0) +
ylim(-40, 0) +
xlim(0, 80) +
scale_color_manual(
values = refin_colors,
@@ -4727,8 +4727,8 @@ plot_npv_health_labor_non_age_vsl <- function(
x = "GHG emissions reduction (%, 2045 vs 2019)"
) +
scale_y_continuous(
limits = c(-50, 0),
breaks = seq(-50, 0, by = 10)
limits = c(-40, 0),
breaks = seq(-40, 0, by = 10)
) +
xlim(0, 80) +
scale_color_manual(
140 changes: 91 additions & 49 deletions README.md
@@ -1,48 +1,90 @@
# Distributional and climate impacts of low-carbon transition pathways for California's oil refining (repo)

## Setting up
This repo relies on the R package ``targets`` to maintain the pipeline of the scripts and the reproducibility of the project.
Install the package if you have not already done so:

```
This repo relies on the R package `targets` to maintain the pipeline of the scripts and the reproducibility of the project. Install the package if you have not already done so:

```
install.packages("targets")
```

Load the package:

```
```
library(targets)
```

All of the functions for the pipeline are in the ``R/`` folder.
To open the ``_targets.R`` script (which is where the workflow is built and specified), run:
All of the functions for the pipeline are in the `R/` folder. To open the `_targets.R` script (which is where the workflow is built and specified), run:

```
```
tar_edit()
```

## Changing the file directory

**IMPORTANT**: Before running the pipeline, one thing needs to be changed -- the path to the ``calepa-cn`` folder.
Look for the ``user`` target:

```
tar_target(name = user, "meas"),
```

And replace the name in the quotations above with a specified user.
## Changing important user-specific `targets`

**IMPORTANT**: Before running the pipeline, several targets should be modified
to reflect user-specific settings. The `setup_data_paths.R` file, which is
sourced in the main `_targets.R` script, should auto-configure paths. Make sure
the data folder has been moved into your main repo (note that this folder is not
tracked by git and therefore is not pulled with the repo).

Next, set the `confidential_data_access` target option based on whether or not
you have access to the confidential datasets. The code is as follows:

```
tar_target(name = confidential_data_access,
command = FALSE),
```

where `FALSE` indicates that the user does not have access and `TRUE` indicates
access. Users with access will have a subfolder within the data folder called
`confidential-data`.
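
If you are unsure which value applies to you, one way to check is to look for the subfolder directly. The sketch below is only an illustration: `has_confidential_data()` is a hypothetical helper that is not part of this repo, and the `"data"` folder name in the usage line is an assumption about where your data folder lives.

```r
# Hypothetical helper (not in the repo): returns TRUE if the data folder
# contains a `confidential-data` subfolder, FALSE otherwise.
has_confidential_data <- function(data_dir) {
  dir.exists(file.path(data_dir, "confidential-data"))
}

# Assumes the data folder is named "data" and sits in the repo root.
has_confidential_data("data")
```

The result can then be used to decide whether to set `confidential_data_access` to `TRUE` or `FALSE` by hand.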

Next, set the following target to reflect which values for `beta` and `cuf`
you are running:

```
## set module settings for specific run (cuf and beta)
tar_target(name = beta_scenario, command = "main"), ## UPDATE WITH ("main", "high", or "low")
tar_target(
name = beta,
command = ifelse(
beta_scenario == "low",
0.00422068,
ifelse(beta_scenario == "high", 0.00737932, 0.00582)
)
),

# Coefficient from Krewski et al (2009) for mortality impact
tar_target(name = ref_threshold, command = 0.6),
```

where the target `beta_scenario` can be "main", "low", or "high". It is used in
file-naming conventions and determines the value of the `beta` target.
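
To see what the nested `ifelse()` resolves to for each scenario, the mapping can be written as a small lookup. This is a minimal sketch outside of `targets`, not the code the pipeline uses: `get_beta()` is a hypothetical helper, and unlike the `ifelse()` version (which falls back to the "main" value for any unrecognized string) it stops with an error on a misspelled scenario.

```r
# Values copied from the `beta` target shown above.
beta_values <- c(low = 0.00422068, main = 0.00582, high = 0.00737932)

# Hypothetical helper (not in the repo): look up beta for a scenario name,
# failing loudly on anything other than "main", "low", or "high".
get_beta <- function(beta_scenario) {
  if (!beta_scenario %in% names(beta_values)) {
    stop("beta_scenario must be one of: ",
         paste(names(beta_values), collapse = ", "))
  }
  beta_values[[beta_scenario]]
}

get_beta("main")
```

A strict lookup like this makes a typo such as `"hgih"` visible immediately instead of silently running the main scenario.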

Finally, specify the `version` target to determine folder names for saving outputs:

```
# list save paths (UPDATE VERSION AS NEEDED)
tar_target(name = version, command = "test-no-conf-data"),

```
In this example, the pipeline will create an output folder with `test-no-conf-data` in its name.


## Using the repo to recreate the analysis

### Debugging the pipeline

In order to check the pipeline is engineered properly, run the following command:

```
```
tar_manifest(fields=command)
```

The output should look something like:
```

```
# A tibble: 90 × 2
name command
<chr> <chr>
@@ -59,19 +101,20 @@ The output should look something like:
# … with 80 more rows
# ℹ Use `print(n = ...)` to see more rows
```

If there are any issues (missing targets, bugs, etc), you should receive an error message.

### Running the pipeline

To build and run the pipeline (this will execute everything), run:

```
```
tar_make()
```

If you are running this for the first time, it should take a few minutes, but the outputs should look something like:

```
```
• start target ei_crude
• built target ei_crude [0.019 seconds]
• start target ei_diesel
@@ -87,90 +130,89 @@ If you are running this for the first time, it should take a few minutes, but th
...
```

Assuming none of the targets change, the next time(s) you run ``tar_make()``, ``targets`` will skip building targets that are already up-to-date.
Assuming none of the targets change, the next time(s) you run `tar_make()`, `targets` will skip building targets that are already up-to-date.

### Viewing and loading targets

If you are new to ``targets`` you might be confused that there are no objects in your environment. That's because the objects are stored locally in a folder called ``_targets`` (in your local repo).
If you are new to `targets` you might be confused that there are no objects in your environment. That's because the objects are stored locally in a folder called `_targets` (in your local repo).

But let's say you want to inspect a specific object, like ``dt_its``. If you want to just view it in your console, you can enter:
But let's say you want to inspect a specific object, like `dt_its`. If you want to just view it in your console, you can enter:

```
```
tar_read(dt_its)
```

And that should print the ``data.table``.
And that should print the `data.table`.

If you want to load the ``data.table`` into your environment, you can run the following instead:
If you want to load the `data.table` into your environment, you can run the following instead:

```
```
tar_load(dt_its)
```

You'll notice the object is in your environment.

You can also view plots. Running the following line should either load the plot in your Plots window or open a new window with the plot:

```
```
tar_read(fig_demand)
```

### Visualizing the pipeline

If you want to visualize the pipeline, run:

```
```
tar_visnetwork()
```

You'll notice the diagram is very small -- you can use your mouse to zoom in on the objects if you'd like. If you make changes to the targets/pipeline and run ``tar_visnetwork()`` before running ``tar_make``, you can see the colors of the objects change.
You'll notice the diagram is very small -- you can use your mouse to zoom in on the objects if you'd like. If you make changes to the targets/pipeline and run `tar_visnetwork()` before running `tar_make()`, you can see the colors of the objects change.

## Example of target changes and impacts on the pipeline

Want an example of what happens when a target is changed? Here's an easy one:

1. Find the target ``ei_crude`` in ``_targets.R``:
1. Find the target `ei_crude` in `_targets.R`:

```
```
tar_target(name = ei_crude, command = 5.698)
```

2. Change the command value to something else, say 10 for example:
2. Change the command value to something else, say 10 for example:

```
```
tar_target(name = ei_crude, command = 10)
```

3. Save the script. Then run:
3. Save the script. Then run:

```
```
tar_visnetwork()
```

4. You'll see the diagram now looks different, with a few lines and points assigned a different color, representing "Outdated". These are the targets affected by the updated ``ei_crude``. Run ``tar_make()`` to rerun the pipeline with the new ``ei_crude`` value:
4. You'll see the diagram now looks different, with a few lines and points assigned a different color, representing "Outdated". These are the targets affected by the updated `ei_crude`. Run `tar_make()` to rerun the pipeline with the new `ei_crude` value:

```
```
tar_make()
```

In the outputs you'll see that the targets that are affected are being updated, while the ones that are unaffected are not being rebuilt.

If you run ``tar_visnetwork()`` everything should be up-to-date now in the diagram.
If you run `tar_visnetwork()` everything should be up-to-date now in the diagram.

**Remember to change the value of the target back to normal (by ctrl + z for example).**

## Output Structure and Git Tracking

This repository uses a standardized output structure defined in `structure.md` and `output_structure.csv`.
The `output_structure.csv` file specifies:
This repository uses a standardized output structure defined in `structure.md` and `output_structure.csv`. The `output_structure.csv` file specifies:

- `file_name`: The name of each output file
- `relative_path`: The path where the file should be saved (relative to `save_path`)
- `tracked`: Whether the file should be tracked in git (`YES` or `NO`)
- `file_name`: The name of each output file
- `relative_path`: The path where the file should be saved (relative to `save_path`)
- `tracked`: Whether the file should be tracked in git (`YES` or `NO`)
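
As an illustration of how these three columns fit together, the sketch below reads a small made-up table and lists the full paths of the tracked files. Only the column names come from this README: the rows, the `save_path` value, and the variable names are assumptions for the example, and this is not how the repo's utility scripts are implemented.

```r
# Made-up rows for illustration; only the column names come from the README.
structure_csv <- "file_name,relative_path,tracked
example_results.csv,results,YES
example_intermediate.csv,intermediate,NO"

output_structure <- read.csv(text = structure_csv, stringsAsFactors = FALSE)

# Keep only the files that should be tracked in git.
tracked_files <- output_structure[output_structure$tracked == "YES", ]

# Build full paths relative to a hypothetical save_path.
save_path <- file.path("outputs", "example-version")
tracked_paths <- file.path(
  save_path,
  tracked_files$relative_path,
  tracked_files$file_name
)
print(tracked_paths)
```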

### Directory Structure

```text
``` text
outputs/
version/
iteration/
@@ -195,14 +237,14 @@ outputs/

Two utility scripts help manage output files and git tracking:

1. `update_gitignore.R`: Updates all `.gitignore` files based on `output_structure.csv`
2. `verify_file_paths.R`: Verifies that all files in `_targets.R` are saved in the correct locations
1. `update_gitignore.R`: Updates all `.gitignore` files based on `output_structure.csv`
2. `verify_file_paths.R`: Verifies that all files in `_targets.R` are saved in the correct locations

### File Saving Conventions

All file-producing targets in `_targets.R` should use the `simple_fwrite_repo` function:

```r
``` r
simple_fwrite_repo(
data = your_data,
folder_path = NULL, # Not needed when using save_path and file_type