Kwiz Computing Technologies

NEMA EIA Report in R: 2024 EMCA Reproducibility Guide

environmental-analytics
EIA
Use targets pipelines and parameterised Quarto reports to audit-proof EIA submissions under Kenya’s 2024 EMCA amendment.
Author

Kwiz Computing Technologies

Published

April 23, 2026

Keywords

EIA reproducibility Kenya, NEMA EIA report R, environmental analytics, targets R pipeline, EMCA compliance

A Category C EIA submitted for a road project along the Nairobi-Mombasa Economic Corridor comes back from NEMA in Q1 2025 with a query: the noise contour table in Section 4.3 does not match the figure in Annex B, and the baseline PM2.5 values differ by 12% between the executive summary and the technical appendix. The consultancy has six weeks to respond. Nobody can trace which version of the spreadsheet produced which number.

This is not a hypothetical. Several high-profile infrastructure EIAs were returned by NEMA in early 2025 for exactly these reasons, and the 2024 EMCA amendment has made the stakes higher: Category B and C reports now go through a mandatory peer-review stage where a licensed EIA Lead Expert can formally challenge any numerical claim in the document. Reproducibility has stopped being merely best practice. It is now a defence against report rejection.

Why NEMA’s 2024 Amendment Changes the Rules

Before the amendment, most NEMA review queries were handled informally between a firm and the review officer. The 2024 Environmental Management and Co-ordination Act amendment introduced a structured peer-review stage that puts a second qualified expert in the room, one with the authority to raise formal objections that delay approval.

The practical consequence is that any table, figure, or statistic in your report needs a clear chain of custody from raw data to final output. If your baseline water quality data lives in one Excel file, your calculations live in another, and your report was written by copying numbers into Word, you have no chain of custody. You have a liability.

Structured as a version-controlled R project with a {targets} pipeline and parameterised Quarto output, the same report becomes self-auditing: every number traces back to a function, every function traces back to a data file, and every data file carries a hash that confirms it has not been altered since the analysis ran.

The {targets} Package: What It Does for EIA Work

The {targets} package by Will Landau implements pipeline-based computation for R. Instead of running scripts sequentially, you declare a set of targets, each one a named object that depends on other targets or on raw inputs. {targets} builds only what has changed since the last run. It records the state of every intermediate object. It skips targets that are already up to date.

For an EIA project, this maps directly to the structure of the work: baseline data collection feeds into data cleaning, which feeds into statistical analysis, which feeds into the tables and figures that populate the report.

A minimal _targets.R file for a Category B EIA looks like this:

library(targets)
library(tarchetypes)

tar_option_set(
  packages = c("dplyr", "readxl", "sf", "ggplot2", "quarto")
)

# Load the project's functions (clean_air_quality(), etc.) from R/
tar_source()

list(
  # 1. Raw data ingestion
  tar_target(
    raw_air_quality,
    read_excel("data/raw/baseline_air_quality.xlsx"),
    format = "rds"
  ),
  tar_target(
    raw_water_quality,
    read_excel("data/raw/baseline_water_quality.xlsx"),
    format = "rds"
  ),

  # 2. Cleaning and validation
  tar_target(
    air_quality_clean,
    clean_air_quality(raw_air_quality),
    format = "rds"
  ),
  tar_target(
    water_quality_clean,
    clean_water_quality(raw_water_quality),
    format = "rds"
  ),

  # 3. Statistical summaries
  tar_target(
    air_summary_table,
    summarise_air_baseline(air_quality_clean),
    format = "rds"
  ),
  tar_target(
    water_summary_table,
    summarise_water_baseline(water_quality_clean),
    format = "rds"
  ),

  # 4. Parameterised report
  tar_quarto(
    eia_report,
    path = "report/eia_report.qmd",
    execute_params = list(
      project_name = "Nairobi Southern Bypass Phase 2",
      nema_ref = "NEMA/EIA/5/2/2026/001",
      reporting_period = "Q4 2025"
    )
  )
)

Running tar_make() in the project root builds the entire pipeline. Running it again after changing a single data file rebuilds only the targets that depend on that file, and nothing else. Every intermediate result is stored with a content hash. If someone changes baseline_air_quality.xlsx and does not tell you, the next tar_make() will detect the change and recompute downstream targets automatically.
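Between runs, you can interrogate the pipeline state directly from the R console. A short session sketch using the {targets} inspection functions (the target names match the example pipeline above):

```r
library(targets)

# List every target whose upstream data or code has changed since the last run
tar_outdated()

# Render an interactive dependency graph of the whole pipeline,
# colour-coded by up-to-date / outdated status (needs the {visNetwork} package)
tar_visnetwork()

# Rebuild only what is outdated, skipping everything else
tar_make()
```

Attaching the output of tar_visnetwork() to an internal review is a quick way to show a colleague, or a peer reviewer, exactly which results a given data file feeds.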

Parameterised Quarto Reports

The report itself is a Quarto document with a params block in its YAML header. Parameters are injected at render time by {targets}, so the same .qmd template can produce outputs for different project phases, project areas, or reporting periods without any manual editing.

A simplified eia_report.qmd header:

---
title: "Environmental Impact Assessment: `r params$project_name`"
subtitle: "NEMA Reference: `r params$nema_ref`"
author: "Kwiz Computing Technologies"
date: "`r Sys.Date()`"
format:
  pdf:
    toc: true
    number-sections: true
    keep-tex: false
params:
  project_name: "Default Project"
  nema_ref: "NEMA/EIA/5/2/XXXX/XXX"
  reporting_period: "Q1 2026"
---

Inside the document, every table and figure calls a target object directly:

# In a code chunk: pull the pre-computed summary table from the pipeline store
targets::tar_read(air_summary_table) |>
  knitr::kable(
    caption = paste0(
      "Table 4.1: Baseline Air Quality Summary, ",
      params$reporting_period
    ),
    digits = 2
  )

Because tar_read() retrieves the stored target object rather than recomputing it, every figure and table in the rendered PDF is guaranteed to match the pipeline output exactly. The discrepancy that triggers NEMA queries (a number in the executive summary that differs from the technical appendix) cannot occur when both sections pull from the same target.

Project Structure for NEMA Submissions

A reproducible EIA project follows a standard directory layout that any colleague or reviewer can read and understand immediately:

project-nema-eia-001/
├── _targets.R              # Pipeline definition
├── _targets/               # Cached pipeline state (git-ignored)
├── R/
│   ├── clean_air_quality.R
│   ├── clean_water_quality.R
│   ├── summarise_air_baseline.R
│   └── summarise_water_baseline.R
├── data/
│   ├── raw/                # Original field data (never modified)
│   └── processed/          # Outputs written by the pipeline
├── report/
│   ├── eia_report.qmd      # Main report template
│   └── sections/           # Child documents for each chapter
├── output/                 # Rendered PDF reports
├── renv.lock               # Exact package versions (reproducible environment)
└── README.md               # Instructions to reproduce the analysis

The data/raw/ directory holds only original files, never overwritten by the pipeline. The R/ directory holds plain functions, one per file, each testable in isolation. The renv.lock file captures the exact version of every R package used, so the analysis runs the same way on a colleague’s machine six months later.
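The cleaning functions themselves stay small and readable. A hypothetical sketch of what R/clean_air_quality.R might contain (the column names and the exclusion threshold are illustrative, not taken from any NEMA template):

```r
clean_air_quality <- function(raw) {
  raw |>
    dplyr::rename_with(tolower) |>
    dplyr::mutate(
      sample_date = as.Date(sample_date),
      pm25_ugm3   = as.numeric(pm25_ugm3)
    ) |>
    # Drop implausible readings; the rule lives in code, not in an email thread
    dplyr::filter(!is.na(pm25_ugm3), pm25_ugm3 >= 0, pm25_ugm3 < 1000)
}
```

Because every exclusion rule is written down as a filter, a peer reviewer can read exactly which records were dropped and why, without asking anyone.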

Version Control for EIA Audit Trails

Git provides the audit trail that NEMA’s peer-review stage now implicitly requires. Each commit records who changed what and when, giving you a documented history of every analytical decision.

For EIA work, adopt a simple commit convention:

  • data: add Q3 2025 air quality monitoring from Athi River station
  • analysis: update PM2.5 baseline method per NEMA guidance note 2024-07
  • report: revise Section 4.3 noise assessment following peer-review comments

This history answers the most common NEMA query ("How did this number change between the draft and the final report?") in seconds: check out the relevant commit and re-render.

Add the _targets/ cache directory to .gitignore: it is large, and the pipeline can reconstruct it from scratch. Keep your data/raw/ files in the repository if they are small enough; if they are large field datasets, document exactly where they come from and how to obtain them.
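A minimal .gitignore for the project layout shown earlier might read (entries are illustrative; some firms also track rendered PDFs for submission records, so adjust to your own policy):

```
_targets/
.Rproj.user/
.Rhistory
output/*.pdf
```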

A note on sensitive project data: consult your firm’s data governance policy before committing client data to any repository. Many Kenyan consultancies use private GitHub or GitLab repositories hosted under their organisation account, which is adequate for EMCA compliance purposes.

Migrating from Excel-Based Workflows

Most Kenyan EIA consultancies do not start with a clean slate. They have existing Excel-based workflows, established templates, and staff who are not R users. The migration does not have to be immediate or total.

A practical three-phase approach:

Phase 1: Read, do not replace. Keep your Excel files as the primary data store. Write R functions that read from them using readxl or openxlsx. Run those functions through a {targets} pipeline. The team keeps working in Excel; R becomes the computation layer downstream of it.

Phase 2: Parameterise the report. Move one chapter of your standard NEMA report template into a Quarto document. Have it read from the {targets} pipeline. Render it alongside the Word document initially. Once the team trusts the output, the Quarto version becomes the primary.

Phase 3: Bring data entry into R. Once the pipeline is established, replace manual Excel data entry with validated R data entry forms or structured CSV templates. At this stage the full pipeline is reproducible from raw inputs.

The two packages that smooth this transition most are readxl for reading Excel files without converting them, and writexl for producing Excel outputs if a client or regulator specifically requires them. Neither requires Excel to be installed.
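A minimal round trip with both packages, as it might appear inside a pipeline function (the workbook, sheet, and file names are hypothetical):

```r
library(readxl)
library(writexl)

# Read the field team's existing workbook without modifying it
noise <- read_excel("data/raw/noise_survey.xlsx", sheet = "Readings")

# Write a cleaned copy back out for a client who requires Excel deliverables
write_xlsx(noise, "data/processed/noise_survey_clean.xlsx")
```

In a {targets} pipeline, the read step becomes a target and the write step a side effect of a downstream target, so both stay under the pipeline's change detection.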

For biodiversity baseline data, our earlier post on GBIF Kenya data quality covers the specific quality checks that belong in the cleaning stage of your pipeline.

What NEMA Peer Reviewers Actually Look For

Conversations with practitioners who have been through the 2024 peer-review process point to three recurring flashpoints.

First, inconsistent summary statistics: a mean value reported differently in the executive summary, the baseline chapter, and the annexes. This is a copy-paste artefact that a {targets} pipeline eliminates entirely because every section pulls from the same computed object.

Second, undocumented data transformations: a baseline dataset that has been filtered or corrected without any record of the decision. The R/ functions in your project document every transformation in code, and Git records when and why the code changed.

Third, outdated figures: a map or chart that reflects an earlier version of the data because the analyst forgot to regenerate it after a late field correction. {targets} makes stale outputs structurally impossible: if the data changes, the pipeline marks downstream targets as outdated and rebuilds them on the next tar_make() call.

The Kenya open data portal and the previous post on ESIA reproducibility provide additional context on the data sources and principles that underpin this kind of workflow.

Where to Start

If you are preparing a Category B or C EIA and want to move toward a reproducible workflow before your next NEMA submission, three resources are worth your time: the {targets} user manual at books.ropensci.org/targets, the renv documentation for package environment management, and Quarto’s guide to parameterised reports at quarto.org/docs/computations/parameters.html.

The change is not primarily technical. The shift from an Excel-based workflow to a {targets} pipeline takes an experienced R practitioner one to two weeks for a typical Category B project. The harder change is procedural: agreeing within your consultancy that the pipeline, not the spreadsheet, is the authoritative version of the analysis.

Given that NEMA’s peer-review stage now means a qualified external expert can formally challenge any number in your report, that agreement has real commercial value. A rejected report costs more than a reproducible workflow.


Kwiz Computing Technologies provides environmental data science services for EIA and ESIA projects across East Africa. If your consultancy is scoping a move to reproducible R workflows for NEMA submissions, contact us to discuss your project.

© 2026 Kwiz Computing Technologies. All rights reserved.
Data Science & Technology | Environmental Analytics | Quantitative Finance


Built with Quarto