Kwiz Computing Technologies

R Plumber API: Auth, Rate Limiting & Deployment

enterprise-data-science
R-Plumber
Go beyond hello-world. Add JWT auth, API key management, rate limiting, and Docker deployment to your R Plumber API in production.
Author

Kwiz Computing Technologies

Published

April 23, 2026

Keywords

R Plumber API production, REST API R deployment, R Shiny API backend, enterprise data science, cloud deployment R

Your Plumber API works on your laptop. Then a teammate hits the endpoint without a token, a rogue script fires 400 requests per minute, and the whole thing falls over at 2 a.m. This gap between tutorial code and production code is where most R developers lose time.

This article covers what you actually need: API key authentication, JWT verification, rate limiting middleware, a clean app structure, Docker packaging, and deployment on a VPS. All with working R code.

Why Hello-World Plumber Tutorials Leave You Exposed

The standard Plumber quickstart looks like this:

library(plumber)

#* @get /predict
function(x) {
  list(result = as.numeric(x) * 2)
}

That is enough to demonstrate the concept. It is not enough for production. The endpoint accepts any request from anyone, has no error handling, logs nothing, errors if x is omitted, and silently returns NA if x is not numeric. And because Plumber serves requests from a single R process, one slow or crashing request blocks every other endpoint.

Production APIs need four things the tutorial skips: authentication, rate limiting, structured error handling, and a deployment target that stays running after you close your terminal.
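As a first step, here is the same endpoint hardened with input validation and an explicit error status (a sketch; the error body shape is our choice, not a Plumber convention):

```r
#* @get /predict
predict_handler <- function(res, x = NULL) {
  # Plumber passes query parameters as strings, so coerce defensively
  x_num <- suppressWarnings(as.numeric(x))
  if (is.null(x) || length(x_num) != 1 || is.na(x_num)) {
    res$status <- 400
    return(list(error = "Query parameter 'x' must be a single number"))
  }
  list(result = x_num * 2)
}
```

A bad request now gets a 400 with a machine-readable body instead of a 500 or a silent NA.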

Structuring a Plumber App for Production

Separate your concerns from the start. A flat api.R file gets unmanageable fast. Instead, use this layout:

api/
├── plumber.R        # Router definition and filter registration
├── auth.R           # Authentication helpers
├── rate_limit.R     # Rate limiting state and logic
├── endpoints/
│   ├── predict.R
│   └── data.R
└── run.R            # Entry point

Your run.R keeps the server start separate from the router logic:

library(plumber)

# pr() parses the annotated router file; source() alone would ignore
# the #* annotations
pr("plumber.R") |>
  pr_run(host = "0.0.0.0", port = 8000)

Your plumber.R defines the router and registers filters before any endpoint:

library(plumber)

source("auth.R")
source("rate_limit.R")

#* @apiTitle Kwiz Analytics API
#* @apiVersion 1.0.0

# Filters run in registration order on every request before it reaches an endpoint
#* @filter log_request
log_request <- function(req, res) {
  cat(format(Sys.time()), req$REQUEST_METHOD, req$PATH_INFO,
      req$HTTP_X_FORWARDED_FOR %||% req$REMOTE_ADDR, "\n")
  plumber::forward()
}

#* @filter authenticate
authenticate_request <- function(req, res) {
  authenticate(req, res)
  plumber::forward()
}

#* @filter rate_limit
rate_limit_request <- function(req, res) {
  check_rate_limit(req, res)
  plumber::forward()
}

# Include endpoint files
#* @plumber
function(pr) {
  pr |>
    pr_mount("/predict", plumb("endpoints/predict.R")) |>
    pr_mount("/data",    plumb("endpoints/data.R"))
}

The %||% operator is a null-coalescing helper. Base R ships one as of 4.4.0, but this variant also falls back when the value is an empty string, which is what we want for missing headers. Add it to auth.R:

`%||%` <- function(a, b) if (!is.null(a) && nchar(a) > 0) a else b

API Key Authentication with Plumber Filters

Filters are the right place for authentication. They run before your endpoint function, so a rejected request never reaches your business logic.

A minimal API key system stores hashed keys in a flat file or database. This example uses a named list as an in-memory store. In a real deployment, read from a Postgres table or Redis on startup.

# auth.R
library(digest)

# In production, load from environment or database at startup
VALID_KEYS <- list(
  "client_nairobi_erp"  = digest::digest("sk_live_abc123", algo = "sha256"),
  "client_mombasa_dash" = digest::digest("sk_live_xyz789", algo = "sha256")
)

authenticate <- function(req, res) {
  # Accept key from header (preferred) or query string (for testing only)
  api_key <- req$HTTP_X_API_KEY %||% req$args$api_key

  if (is.null(api_key) || api_key == "") {
    res$status <- 401
    stop(jsonlite::toJSON(list(error = "Missing API key"), auto_unbox = TRUE))
  }

  key_hash <- digest::digest(api_key, algo = "sha256")

  # Compare hashes rather than raw keys: any timing difference in
  # identical() then leaks nothing useful about the stored secrets
  matches <- vapply(VALID_KEYS, function(h) identical(h, key_hash), logical(1))

  if (!any(matches)) {
    res$status <- 403
    stop(jsonlite::toJSON(list(error = "Invalid API key"), auto_unbox = TRUE))
  }

  # Attach client identity to the request for downstream use
  req$client_id <- names(VALID_KEYS)[which(matches)[1]]

  invisible(NULL)
}

The stop() pattern halts filter execution and returns the error body to the caller. Plumber catches the condition and sends the response.
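If you want full control over the error body instead of Plumber's default wrapper, you can register your own handler with pr_set_error(). A minimal sketch, keeping any status a filter already set and defaulting to 500 otherwise:

```r
# Called by Plumber whenever a filter or endpoint signals a condition
json_error_handler <- function(req, res, err) {
  # Respect a status the filter set before stop(); otherwise it is a 500
  if (is.null(res$status) || res$status == 200L) res$status <- 500L
  # Plumber serializes this returned list as the response body
  list(error = conditionMessage(err))
}

# Registered once on the router, e.g. in run.R:
# pr("plumber.R") |> pr_set_error(json_error_handler) |> pr_run(port = 8000)
```

Note that the filters above pass a JSON string to stop(); with a handler like this you would stop() with the plain message instead and let the handler do the encoding.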

JWT Authentication for User-Scoped Endpoints

API keys work well for service-to-service calls. For endpoints that serve individual users (common when a Shiny app is the frontend), JWTs let you encode identity and expiry without a database lookup on every request.

# jwt_auth.R
library(jose)

JWT_SECRET <- Sys.getenv("JWT_SECRET")  # Set in .Renviron or Docker env

issue_token <- function(user_id, role, expires_hours = 8) {
  # jwt_claim() builds the claim object jose expects and sets iat itself
  claim <- jwt_claim(
    sub  = user_id,
    role = role,
    exp  = as.integer(Sys.time()) + (expires_hours * 3600)
  )
  jwt_encode_hmac(claim, secret = JWT_SECRET)
}

verify_token <- function(req, res) {
  auth_header <- req$HTTP_AUTHORIZATION
  if (is.null(auth_header) || !startsWith(auth_header, "Bearer ")) {
    res$status <- 401
    stop(jsonlite::toJSON(list(error = "Missing bearer token"), auto_unbox = TRUE))
  }

  token <- sub("^Bearer ", "", auth_header)

  claims <- tryCatch(
    jwt_decode_hmac(token, secret = JWT_SECRET),
    error = function(e) NULL
  )

  if (is.null(claims)) {
    res$status <- 401
    stop(jsonlite::toJSON(list(error = "Invalid or expired token"), auto_unbox = TRUE))
  }

  # jose already rejects expired tokens during decode (caught above), so
  # this explicit check is defense-in-depth
  if (!is.null(claims$exp) && as.integer(Sys.time()) > claims$exp) {
    res$status <- 401
    stop(jsonlite::toJSON(list(error = "Token expired"), auto_unbox = TRUE))
  }

  req$user_id <- claims$sub
  req$user_role <- claims$role
  invisible(NULL)
}

Your Shiny frontend calls /auth/login to get a token, then includes it in every subsequent API request as Authorization: Bearer <token>. This pairs naturally with the architecture described in the R Shiny hosting guide.
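A minimal sketch of that login endpoint (check_credentials() is an assumed helper you would implement against your user store; issue_token() is defined in jwt_auth.R above):

```r
# endpoints/auth_login.R (hypothetical file name)

#* Exchange a username and password for a short-lived JWT
#* @post /login
login_handler <- function(res, username = "", password = "") {
  # check_credentials() is assumed to return the user's role on success
  # and NULL on failure
  role <- check_credentials(username, password)
  if (is.null(role)) {
    res$status <- 401
    return(list(error = "Invalid credentials"))
  }
  list(token = issue_token(user_id = username, role = role))
}
```

Mount it like the other endpoint files and the Shiny app has a single place to trade credentials for a token.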

Rate Limiting with a Token Bucket

Rate limiting protects your API from both malicious abuse and accidental loops in client code. The token bucket algorithm is simple to implement in R: each client gets a bucket of N tokens that refills at a fixed rate. Each request consumes one token. When the bucket is empty, the request is rejected.

# rate_limit.R

# In-memory state: an environment mapping client key -> list(tokens, last_refill)
# For multi-process deployments, move this to Redis via the redux package
rate_limit_state <- new.env(hash = TRUE, parent = emptyenv())

BUCKET_CAPACITY  <- 60   # Max tokens per client
REFILL_RATE      <- 60   # Tokens added per minute
REFILL_INTERVAL  <- 60   # Seconds between refills

get_client_key <- function(req) {
  # Use API key identity if available, otherwise IP
  req$client_id %||% (req$HTTP_X_FORWARDED_FOR %||% req$REMOTE_ADDR)
}

check_rate_limit <- function(req, res) {
  client <- get_client_key(req)
  now    <- as.numeric(Sys.time())

  if (!exists(client, envir = rate_limit_state)) {
    assign(client,
           list(tokens = BUCKET_CAPACITY, last_refill = now),
           envir = rate_limit_state)
  }

  state <- get(client, envir = rate_limit_state)

  # Refill tokens based on elapsed time
  elapsed       <- now - state$last_refill
  refill_amount <- floor(elapsed / REFILL_INTERVAL) * REFILL_RATE
  new_tokens    <- min(BUCKET_CAPACITY, state$tokens + refill_amount)
  last_refill   <- if (refill_amount > 0) now else state$last_refill

  if (new_tokens < 1) {
    res$status <- 429
    res$setHeader("Retry-After", ceiling(REFILL_INTERVAL - (now - last_refill)))
    stop(jsonlite::toJSON(list(error = "Rate limit exceeded"), auto_unbox = TRUE))
  }

  # Consume one token
  assign(client,
         list(tokens = new_tokens - 1, last_refill = last_refill),
         envir = rate_limit_state)

  invisible(NULL)
}

This in-memory approach works for a single-process deployment. For multiple workers, replace rate_limit_state with Redis calls via the redux package: each check becomes HINCRBY and EXPIRE operations on a Redis hash keyed by client ID.
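A fixed-window variant of that idea, sketched with plain INCR/EXPIRE keys rather than a hash (the key prefix and window length are illustrative). The Redis client is passed in, so in production you would supply redux::hiredis() while tests can stub it:

```r
# redis_client exposes $INCR(key) and $EXPIRE(key, seconds), as the
# redux client does; in production: redis_client <- redux::hiredis()
check_rate_limit_redis <- function(client_id, redis_client,
                                   limit = 60, window = 60) {
  key <- paste0("rl:", client_id)
  count <- redis_client$INCR(key)
  if (count == 1) {
    # First request of this window: start the expiry clock
    redis_client$EXPIRE(key, window)
  }
  count <= limit  # TRUE = allowed, FALSE = over the limit
}
```

Because INCR is atomic, this stays correct across any number of API containers sharing one Redis instance.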

Packaging with Docker

Docker solves the “works on my machine” problem. It also makes deployment to any Linux VPS straightforward, including the affordable DigitalOcean droplets and Hetzner servers that many East African development teams use when AWS credit access is limited.

Create a Dockerfile at the project root:

FROM rocker/r-ver:4.4.0

# System dependencies
RUN apt-get update && apt-get install -y \
    libcurl4-openssl-dev \
    libssl-dev \
    libsodium-dev \
    && rm -rf /var/lib/apt/lists/*

# Install R packages
RUN Rscript -e "install.packages(c('plumber', 'jsonlite', 'digest', 'jose', 'redux', 'logger'), repos = 'https://cloud.r-project.org')"

WORKDIR /api

# Copy application code
COPY api/ .

EXPOSE 8000

CMD ["Rscript", "run.R"]

Build and run locally to verify:

docker build -t kwiz-api:latest .
docker run -p 8000:8000 \
  -e JWT_SECRET="your-secret-here" \
  kwiz-api:latest

Test authentication before shipping:

# Should return 401
curl http://localhost:8000/predict/score

# Should return data
curl -H "X-Api-Key: sk_live_abc123" "http://localhost:8000/predict/score?x=42"

Deploying to a VPS

A 2 GB RAM DigitalOcean droplet (about $12/month, payable in KES via M-Pesa through Safaricom or PayPal) handles moderate API traffic without the complexity of managed Kubernetes. The deployment process is the same on Hetzner, Vultr, or any other provider.

On the server, install Docker and docker-compose, then write a docker-compose.yml:

version: "3.8"
services:
  api:
    image: kwiz-api:latest
    restart: always
    ports:
      - "8000:8000"
    environment:
      - JWT_SECRET=${JWT_SECRET}
    volumes:
      - ./logs:/api/logs

  nginx:
    image: nginx:alpine
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - /etc/letsencrypt:/etc/letsencrypt:ro

Your nginx.conf handles TLS termination and proxies requests to the Plumber container. Use certbot with Let’s Encrypt for free TLS certificates. This matches the Quarto cloud deployment pattern if you are hosting both your site and API on the same server.

Push your image to Docker Hub or GitHub Container Registry, then pull and start on the server:

docker-compose pull
docker-compose up -d

Set restart: always so the container comes back up after a server reboot or crash.

Connecting a Shiny Frontend

Once the API is running, your Shiny app calls it with httr2:

library(httr2)

get_prediction <- function(x, api_key) {
  request("https://api.yourdomain.com/predict/score") |>
    req_headers("X-Api-Key" = api_key) |>
    req_url_query(x = x) |>
    req_perform() |>
    resp_body_json()
}
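Client code should also expect 401s and 429s. httr2 treats 429 as transient by default and honours the Retry-After header, so a retry budget is one line; building the request separately keeps it inspectable without hitting the network (the retry count is a choice, not a recommendation):

```r
library(httr2)

build_prediction_request <- function(x, api_key) {
  request("https://api.yourdomain.com/predict/score") |>
    req_headers("X-Api-Key" = api_key) |>
    req_url_query(x = x) |>
    # Retries transient statuses (429/503) with backoff, honouring Retry-After
    req_retry(max_tries = 3)
}

get_prediction_safe <- function(x, api_key) {
  build_prediction_request(x, api_key) |>
    req_perform() |>
    resp_body_json()
}
```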

This separation keeps your Shiny app stateless. The API handles computation and data access; Shiny handles the interface. Teams can update the model backend without redeploying the frontend, and multiple Shiny apps can share the same API. For teams already following this approach, see how it fits into a broader enterprise R architecture in our guide for business leaders on R.

Several enterprise clients we work with in Nairobi run this exact stack: a Plumber API on a VPS handling model inference, Shiny apps connecting to it from shinyapps.io, and API keys rotated quarterly. It keeps infrastructure costs low while maintaining clear security boundaries.

What to Build Next

The patterns here scale further. Add an /auth/token endpoint that issues short-lived JWTs in exchange for valid API keys. Wire the rate limiter to Redis so it works across multiple API containers behind a load balancer. Add a /health endpoint that checks database connectivity and returns a non-200 status if anything is broken, so your uptime monitor catches problems before users do.
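The health endpoint can be a single annotated function. A sketch, assuming a DBI connection con opened at startup (the probe query and the 503 body shape are our choices):

```r
# endpoints/health.R (hypothetical; `con` is a DBI connection opened at startup)

#* Liveness probe for uptime monitors
#* @get /health
health_handler <- function(res) {
  db_ok <- tryCatch({
    DBI::dbGetQuery(con, "SELECT 1")
    TRUE
  }, error = function(e) FALSE)

  if (!db_ok) {
    res$status <- 503
    return(list(status = "degraded", database = "unreachable"))
  }
  list(status = "ok")
}
```

Point your uptime monitor at /health and alert on anything other than a 200.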

If your team is exposing model predictions or data pipelines as HTTP endpoints and you want a review of the architecture or help with the deployment, Kwiz Computing works with data science teams across East Africa on exactly this kind of infrastructure.

What part of the Plumber-to-production gap is slowing your team down most? Authentication, deployment, or scaling?

© 2026 Kwiz Computing Technologies. All rights reserved.
Data Science & Technology | Environmental Analytics | Quantitative Finance


Built with Quarto