Memory Management for Large Simulations
Source: vignettes/memory-management.Rmd
Introduction
Simulation studies in Bayesian modeling can consume large amounts of memory. When running thousands of tasks, each producing posterior draws, fitted model objects, predictions, and diagnostics, the accumulated data can exhaust available RAM. This vignette explains how to use bayesim’s retention system to control memory usage during large-scale simulations.
Consider a simulation with:
- 50 data configurations
- 5 model variants
- 100 replicates per combination
- 4,000 posterior draws per fit
- 50 parameters tracked
This creates 25,000 tasks. If each fit object is 50 MB and each draws matrix is 20 MB, retaining everything would require over 1.7 TB of memory. We need a way to retain only what is necessary.
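The arithmetic behind that estimate can be checked directly. The following is a back-of-the-envelope sketch in base R that uses only the figures assumed above (50 MB per fit object, 20 MB per draws matrix); the variable names are illustrative and not part of bayesim.

```r
# Back-of-the-envelope memory estimate using the figures assumed above.
n_data_configs <- 50
n_models       <- 5
n_replicates   <- 100

n_tasks <- n_data_configs * n_models * n_replicates

fit_mb   <- 50  # assumed size of one fit object, in MB
draws_mb <- 20  # assumed size of one draws matrix, in MB

total_mb <- n_tasks * (fit_mb + draws_mb)
total_tb <- total_mb / 1e6  # decimal units: 1 TB = 1e6 MB

cat("Tasks:", n_tasks, "\n")
cat("Total if everything is retained:", total_tb, "TB\n")
```

Predictions, data, and diagnostics are left out of this sketch, so the real footprint with full retention would be somewhat higher.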
The Retention System
What Gets Stored
During simulation, each task produces several types of artifacts:
| Artifact | Description | Typical Size |
|---|---|---|
| metrics | Computed metric values (RMSE, bias, coverage, etc.) | Small (KB) |
| diagnostics | Convergence diagnostics (R-hat, ESS, divergences) | Small (KB) |
| draws | Posterior draws matrix | Medium (MB) |
| predictions | Predicted values for test/training data | Variable |
| fit | Raw fit object from the modeling backend | Large (MB-GB) |
| data | Input training and test data | Variable |
| warnings | Warning messages from fitting | Small (KB) |
The retain Parameter
The retain parameter in simulation_config()
controls which artifacts are kept in memory:
library(bayesim)
# Default retention: metrics and diagnostics
config <- simulation_config(
data_grid = data.frame(n = 100, sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = function(data_spec, seed, task_ctx) {
# Note: seed is a scalar task seed and the task RNG stream is restored
# before each call.
n <- data_spec$n
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = data_spec$sigma)
list(
train = data.frame(y = y, x = x),
test = NULL,
response = "y",
true_params = c(intercept = 1, slope = 2, sigma = data_spec$sigma),
vars_of_interest = c("intercept", "slope", "sigma"),
references = c(intercept = 0, slope = 0, sigma = 1),
meta = list()
)
},
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric()),
n_replicates = 10L,
seed = 42L,
retain = c("metrics", "diagnostics") # Default
)
Retention Profiles
bayesim provides three common retention profiles for typical use
cases. You can use the resolve_retention() function to get
the explicit character vector for each profile:
Minimal Profile: Keep Only Essential Metrics
Use this for very large simulations where you only need summary statistics:
# Get the minimal retention profile
minimal_retain <- resolve_retention("minimal")
print(minimal_retain)
#> [1] "metrics"
config_minimal <- simulation_config(
data_grid = data.frame(n = 100, sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = function(data_spec, seed, task_ctx) {
# Note: seed is a scalar task seed and the task RNG stream is restored
# before each call.
n <- data_spec$n
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = data_spec$sigma)
list(
train = data.frame(y = y, x = x),
test = NULL,
response = "y",
true_params = c(intercept = 1, slope = 2, sigma = data_spec$sigma),
vars_of_interest = c("intercept", "slope", "sigma"),
references = c(intercept = 0, slope = 0, sigma = 1),
meta = list()
)
},
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric()),
n_replicates = 10L,
seed = 42L,
retain = minimal_retain # Only metrics
)
The "minimal" profile retains only metrics,
giving you the smallest memory footprint.
Standard Profile: Keep Metrics, Diagnostics, and Warnings
This is the default profile, balancing information with memory usage:
# Get the standard retention profile
standard_retain <- resolve_retention("standard")
print(standard_retain)
#> [1] "metrics" "diagnostics" "warnings"
config_standard <- simulation_config(
data_grid = data.frame(n = 100, sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = function(data_spec, seed, task_ctx) {
# Note: seed is a scalar task seed and the task RNG stream is restored
# before each call.
n <- data_spec$n
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = data_spec$sigma)
list(
train = data.frame(y = y, x = x),
test = NULL,
response = "y",
true_params = c(intercept = 1, slope = 2, sigma = data_spec$sigma),
vars_of_interest = c("intercept", "slope", "sigma"),
references = c(intercept = 0, slope = 0, sigma = 1),
meta = list()
)
},
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric()),
n_replicates = 10L,
seed = 42L,
retain = standard_retain
)
The "standard" profile retains metrics,
diagnostics, and warnings, useful for checking
convergence and capturing any warning messages across your
simulation.
Debug Profile: Keep Everything
Use this for debugging or small studies where you need full access to all artifacts:
# Get the debug retention profile
debug_retain <- resolve_retention("debug")
print(debug_retain)
#> [1] "metrics" "diagnostics" "draws" "predictions" "fit"
#> [6] "data" "warnings"
config_debug <- simulation_config(
data_grid = data.frame(n = 100, sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = function(data_spec, seed, task_ctx) {
# Note: seed is a scalar task seed and the task RNG stream is restored
# before each call.
n <- data_spec$n
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = data_spec$sigma)
list(
train = data.frame(y = y, x = x),
test = NULL,
response = "y",
true_params = c(intercept = 1, slope = 2, sigma = data_spec$sigma),
vars_of_interest = c("intercept", "slope", "sigma"),
references = c(intercept = 0, slope = 0, sigma = 1),
meta = list()
)
},
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric()),
n_replicates = 10L,
seed = 42L,
retain = debug_retain # Everything
)
The "debug" profile retains all artifacts:
metrics, diagnostics, draws,
predictions, fit, data, and
warnings.
Custom Retention
For fine-grained control, specify a custom character vector:
# Keep metrics, diagnostics, and draws, but not fit objects or data
config_custom <- simulation_config(
data_grid = data.frame(n = 100, sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = function(data_spec, seed, task_ctx) {
# Note: seed is a scalar task seed and the task RNG stream is restored
# before each call.
n <- data_spec$n
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = data_spec$sigma)
list(
train = data.frame(y = y, x = x),
test = NULL,
response = "y",
true_params = c(intercept = 1, slope = 2, sigma = data_spec$sigma),
vars_of_interest = c("intercept", "slope", "sigma"),
references = c(intercept = 0, slope = 0, sigma = 1),
meta = list()
)
},
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric(), coverage_metric()),
n_replicates = 10L,
seed = 42L,
retain = c("metrics", "diagnostics", "draws", "warnings")
)
Valid retention options are:
- "metrics": Always retained (required)
- "diagnostics": Convergence diagnostics
- "draws": Posterior draws matrix
- "predictions": Predicted values
- "fit": Raw fit object
- "data": Input data
- "warnings": Warning messages
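A typo in a custom vector is easy to make, so it can help to check the vector before building a configuration. The helper below, check_retain(), is hypothetical and written in base R. It is not part of bayesim (which may perform its own validation) and only encodes the option names listed above.

```r
# Hypothetical helper (not part of bayesim): checks a custom retention
# vector against the option names listed above.
valid_retain <- c("metrics", "diagnostics", "draws", "predictions",
                  "fit", "data", "warnings")

check_retain <- function(retain) {
  stopifnot(is.character(retain))
  unknown <- setdiff(retain, valid_retain)
  if (length(unknown) > 0) {
    stop("Unknown retention option(s): ", paste(unknown, collapse = ", "))
  }
  # "metrics" is described above as always retained, so add it if missing.
  union("metrics", retain)
}

# Returns the validated vector with "metrics" placed first.
checked <- check_retain(c("diagnostics", "draws"))
print(checked)
```

A misspelled option such as "fits" stops with an error instead of being passed on silently.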
When to Use What
Small Studies (< 100 Tasks)
For small studies, you can safely use the debug profile:
# Small study: 2 data configs × 1 model × 20 replicates = 40 tasks
small_config <- simulation_config(
data_grid = data.frame(n = c(100, 500), sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric(), coverage_metric()),
n_replicates = 20L,
seed = 42L,
retain = resolve_retention("debug") # Keep everything for detailed analysis
)
With fewer than 100 tasks, even retaining all artifacts typically uses less than 5 GB of memory.
Medium Studies (100-1000 Tasks)
For medium-sized studies, use the standard profile to keep diagnostics:
# Medium study: 6 data configs × 2 models × 50 replicates = 600 tasks
medium_config <- simulation_config(
data_grid = data.frame(
n = rep(c(100, 200, 500), each = 2),
sigma = rep(c(0.5, 1), 3)
),
fit_grid = data.frame(model = c("baseline", "full")),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric(), bias_metric(), coverage_metric()),
n_replicates = 50L,
seed = 42L,
retain = resolve_retention("standard")
)
The "standard" profile keeps enough information to
monitor convergence while avoiding the memory overhead of storing all
fit objects.
Large Studies (> 1000 Tasks)
For large studies, use the minimal profile or a custom profile:
# Large study: 24 data configs × 5 models × 100 replicates = 12,000 tasks
large_config <- simulation_config(
data_grid = expand.grid(
n = c(50, 100, 200, 500),
sigma = c(0.5, 1, 2),
effect = c(0.5, 1.0)
),
fit_grid = data.frame(model = c("m1", "m2", "m3", "m4", "m5")),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric(), coverage_metric()),
n_replicates = 100L,
seed = 42L,
retain = resolve_retention("minimal"), # Only metrics
result_path = "large_simulation",
checkpoint_every = 100L
)
With "minimal", you keep only the computed metrics,
reducing memory usage to megabytes even for thousands of tasks.
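To get a feel for why a metrics-only table stays small, the sketch below measures a mock summary in base R. The shape (10,000 rows, one per task, with 20 numeric columns) is an assumption chosen for illustration. It is not the actual layout of a bayesim summary.

```r
# Illustrative only: a mock metrics-only summary with one row per task.
# The 10,000 rows and 20 numeric columns are assumptions for this sketch.
n_tasks   <- 10000
n_columns <- 20

mock_summary <- as.data.frame(
  matrix(rnorm(n_tasks * n_columns), nrow = n_tasks)
)

size_bytes <- as.numeric(object.size(mock_summary))
cat("Mock summary size:",
    format(object.size(mock_summary), units = "MB"), "\n")
```

Each numeric value takes 8 bytes, so the table holds about 10,000 × 20 × 8 = 1.6 million bytes of data plus a small amount of overhead.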
Checkpointing for Large Studies
When running large simulations, combine retention profiles with checkpointing to manage both memory and execution time:
# Very large study with checkpointing
huge_config <- simulation_config(
data_grid = expand.grid(
n = c(100, 500, 1000),
sigma = c(0.5, 1, 2),
effect = c(0.5, 1.0, 2.0)
),
fit_grid = data.frame(model = c("baseline", "complex", "spline")),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric(), bias_metric(), coverage_metric()),
n_replicates = 200L,
seed = 42L,
retain = resolve_retention("minimal"),
result_path = "huge_simulation_results", # Enable checkpointing
checkpoint_every = 50L # Save every 50 tasks
)
# Run with resume capability
result <- run_simulation(huge_config, resume = "auto")
How Checkpointing Manages Memory
When checkpointing is enabled, bayesim implements memory-bounded execution:
- Task results are accumulated in memory during execution
- When chunk_size tasks complete (or checkpoint_every tasks, whichever comes first), a checkpoint is written to disk
- After checkpointing, heavy objects in task results are cleared from memory while keeping lightweight summary data (metrics, diagnostics)
- Only pending tasks and lightweight summaries remain in memory
Memory usage stays bounded by chunk_size tasks,
regardless of total study size.
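The accumulate, flush, and clear cycle described above can be sketched in base R. This is an illustration of the general pattern, not bayesim's implementation: run_task(), the draws field, and the chunk file names are hypothetical stand-ins.

```r
# Illustrative sketch of memory-bounded accumulation in base R.
# This is NOT bayesim's implementation; run_task() and the field names
# are hypothetical stand-ins.
run_task <- function(i) {
  list(
    task_id = i,
    metrics = c(rmse = abs(rnorm(1))),              # lightweight, kept
    draws   = matrix(rnorm(4000 * 5), nrow = 4000)  # heavy, cleared later
  )
}

n_tasks    <- 20
chunk_size <- 5
out_dir    <- tempfile("checkpoints_")
dir.create(out_dir)

results <- vector("list", n_tasks)
pending <- integer(0)  # indices of tasks not yet written to disk

for (i in seq_len(n_tasks)) {
  results[[i]] <- run_task(i)
  pending <- c(pending, i)

  if (length(pending) >= chunk_size || i == n_tasks) {
    # Write the results for this chunk to disk ...
    saveRDS(results[pending],
            file.path(out_dir, sprintf("chunk_%03d.rds", i)))
    # ... then drop the heavy field, keeping the lightweight summaries.
    for (j in pending) results[[j]]$draws <- NULL
    pending <- integer(0)
  }
}

# Count how many heavy objects are still held in memory after the loop.
heavy_left <- sum(vapply(results, function(r) !is.null(r$draws), logical(1)))
cat("Heavy objects still in memory:", heavy_left, "\n")
```

In this sketch no more than chunk_size draws matrices are held at any time, which is the sense in which memory stays bounded regardless of n_tasks.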
You can control the memory bound with the chunk_size
parameter:
# Tighter memory control - checkpoint and clear every 25 tasks
tight_memory_config <- simulation_config(
data_grid = expand.grid(n = c(100, 500), sigma = c(0.5, 1)),
fit_grid = data.frame(model = c("m1", "m2")),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric(), coverage_metric()),
n_replicates = 100L,
seed = 42L,
retain = resolve_retention("minimal"),
result_path = "tight_memory_simulation",
checkpoint_every = 100L, # Write checkpoint every 100 tasks
chunk_size = 25L # Clear memory after 25 tasks
)
When chunk_size is smaller than
checkpoint_every, the system checkpoints early (at the
chunk_size threshold) to free memory. When
chunk_size is larger, checkpoints happen at
checkpoint_every intervals as usual.
max_in_memory remains available as a compatibility
alias.
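The interaction between the two parameters amounts to taking whichever threshold is reached first. The base R sketch below lists the task counts at which a flush would occur under that rule. flush_points() is a hypothetical helper, and the final flush at the end of the run is an assumption of this sketch rather than documented bayesim behavior.

```r
# Sketch of the rule described above: results are flushed at whichever
# threshold is reached first. flush_points() is a hypothetical helper,
# not a bayesim function.
flush_points <- function(n_tasks, checkpoint_every, chunk_size) {
  step <- min(checkpoint_every, chunk_size)
  points <- if (step <= n_tasks) seq(step, n_tasks, by = step) else numeric(0)
  # Assumption: a final flush happens when the run ends mid-interval.
  if (length(points) == 0 || tail(points, 1) != n_tasks) {
    points <- c(points, n_tasks)
  }
  points
}

# chunk_size smaller than checkpoint_every: flush every 25 tasks.
tight <- flush_points(n_tasks = 100, checkpoint_every = 100L, chunk_size = 25L)
print(tight)

# chunk_size larger than checkpoint_every: flush every 50 tasks.
loose <- flush_points(n_tasks = 100, checkpoint_every = 50L, chunk_size = 80L)
print(loose)
```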
Resuming Efficiently
If a simulation is interrupted, resuming continues from the last checkpoint without reloading previous full results:
# First run - gets interrupted
result <- run_simulation(huge_config, resume = "auto")
# Later - resume from checkpoint
result <- resume_simulation("huge_simulation_results")
The checkpoint system stores summary data (metrics, diagnostics) needed for analysis. Heavy high-cardinality metric payloads are externalized into artifact files under the result directory rather than widening the main summary table indefinitely.
Example: Comparing Retention Profiles
Let’s compare memory usage across different retention profiles:
# Data generator for comparison
demo_generator <- function(data_spec, seed, task_ctx) {
# Note: seed is a scalar task seed and the task RNG stream is restored
# before each call.
n <- data_spec$n
x <- rnorm(n)
y <- data_spec$intercept + data_spec$slope * x + rnorm(n, sd = data_spec$sigma)
list(
train = data.frame(y = y, x = x),
test = NULL,
response = "y",
true_params = c(
intercept = data_spec$intercept,
slope = data_spec$slope,
sigma = data_spec$sigma
),
vars_of_interest = c("intercept", "slope", "sigma"),
references = c(intercept = 0, slope = 0, sigma = 1),
meta = list()
)
}
# Run with minimal retention
config_minimal <- simulation_config(
data_grid = data.frame(
n = 100,
intercept = 1,
slope = 2,
sigma = 1
),
fit_grid = data.frame(model = "linear"),
data_generator = demo_generator,
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric(), coverage_metric()),
n_replicates = 20L,
seed = 42L,
retain = resolve_retention("minimal")
)
result_minimal <- run_simulation(config_minimal, progress = FALSE)
#> ℹ Starting simulation with 20 tasks
# Run with standard retention
config_standard <- simulation_config(
data_grid = data.frame(
n = 100,
intercept = 1,
slope = 2,
sigma = 1
),
fit_grid = data.frame(model = "linear"),
data_generator = demo_generator,
fitter = MockFitter(),
metrics = list(rmse_metric(), bias_metric(), coverage_metric()),
n_replicates = 20L,
seed = 42L,
retain = resolve_retention("standard")
)
result_standard <- run_simulation(config_standard, progress = FALSE)
#> ℹ Starting simulation with 20 tasks
# Compare summary outputs
cat("=== Minimal Retention Summary ===\n")
#> === Minimal Retention Summary ===
print(head(result_minimal$summary, 3))
#> task_id status rmse__value rmse__n_obs bias__value coverage__mean
#> 1 d001_f001_r00001 success 2.160240 100 0.42322202 1
#> 2 d001_f001_r00002 success 2.107857 100 0.42394617 1
#> 3 d001_f001_r00003 success 2.401193 100 0.09284687 1
#> coverage__by_param__intercept coverage__by_param__slope
#> 1 1 1
#> 2 1 1
#> 3 1 1
#> coverage__by_param__sigma timing_total rep_idx data_n data_intercept
#> 1 1 0.035192728 1 100 1
#> 2 1 0.047021866 2 100 1
#> 3 1 0.005603313 3 100 1
#> data_slope data_sigma fit_model
#> 1 2 1 linear
#> 2 2 1 linear
#> 3 2 1 linear
cat("\n=== Standard Retention Summary ===\n")
#>
#> === Standard Retention Summary ===
print(head(result_standard$summary, 3))
#> task_id status rmse__value rmse__n_obs bias__value coverage__mean
#> 1 d001_f001_r00001 success 2.160240 100 0.42322202 1
#> 2 d001_f001_r00002 success 2.107857 100 0.42394617 1
#> 3 d001_f001_r00003 success 2.401193 100 0.09284687 1
#> coverage__by_param__intercept coverage__by_param__slope
#> 1 1 1
#> 2 1 1
#> 3 1 1
#> coverage__by_param__sigma rhat_max ess_bulk ess_tail divergent max_treedepth
#> 1 1 1.01 400 350 0 0
#> 2 1 1.01 400 350 0 0
#> 3 1 1.01 400 350 0 0
#> timing_total rep_idx data_n data_intercept data_slope data_sigma fit_model
#> 1 0.005848169 1 100 1 2 1 linear
#> 2 0.005661011 2 100 1 2 1 linear
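#> 3 0.005575418 3 100 1 2 1 linear
Both profiles report the same metric values in their summary tables. The standard profile additionally carries the diagnostic columns (rhat_max, ess_bulk, ess_tail, divergent, max_treedepth). The remaining difference is in what additional data is available in the task results: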
# Check what's available in task results (if any)
cat("Minimal retention - task results available:",
!is.null(result_minimal$task_results), "\n")
#> Minimal retention - task results available: TRUE
cat("Standard retention - task results available:",
!is.null(result_standard$task_results), "\n")
#> Standard retention - task results available: TRUE
Memory Estimation
Use object.size() and estimate_size() to
check memory usage:
# Estimate size of the result objects
cat("Minimal result size:",
format(object.size(result_minimal), units = "KB"), "\n")
#> Minimal result size: 70.8 Kb
cat("Standard result size:",
format(object.size(result_standard), units = "KB"), "\n")
#> Standard result size: 89.8 Kb
# Estimate size of individual components
if (!is.null(result_standard$task_results) && length(result_standard$task_results) > 0) {
first_task <- result_standard$task_results[[1]]
cat("\nSingle task result size:",
format(estimate_size(first_task), units = "KB"), "\n")
}
#>
#> Single task result size: 3752
Tips for Memory Management
Monitor Memory with gc() and object.size()
Regularly check memory usage during development:
# Before simulation
gc()
start_mem <- sum(gc()[, 2]) # Get used memory in MB
# Run simulation
result <- run_simulation(config, progress = FALSE)
# After simulation
gc()
end_mem <- sum(gc()[, 2])
cat("Memory used:", end_mem - start_mem, "MB\n")
# Check result size
result_size <- object.size(result)
cat("Result object size:", format(result_size, units = "MB"), "\n")
Use Test Runs to Estimate Memory Needs
Before running a large study, do a small test run to estimate per-task memory:
# Test run with 5 tasks
test_config <- simulation_config(
data_grid = data.frame(n = 100, sigma = 1),
fit_grid = data.frame(model = "linear"),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric()),
n_replicates = 5L,
seed = 42L,
retain = resolve_retention("debug") # Test with full retention
)
test_result <- run_simulation(test_config, progress = FALSE)
# Estimate per-task memory
if (!is.null(test_result$task_results) && length(test_result$task_results) > 0) {
per_task_size <- estimate_size(test_result$task_results[[1]])
total_tasks <- 10000 # Your planned study size
estimated_total <- per_task_size * total_tasks
cat("Per-task size:", format(per_task_size, units = "KB"), "\n")
cat("Estimated total for", total_tasks, "tasks:",
format(estimated_total, units = "GB"), "\n")
}
Consider Splitting Very Large Studies
If your study is too large for available memory even with minimal retention, split it into chunks:
# Split by data configuration
n_values <- c(100, 500, 1000)
for (n_val in n_values) {
chunk_config <- simulation_config(
data_grid = data.frame(n = n_val, sigma = c(0.5, 1, 2)),
fit_grid = data.frame(model = c("m1", "m2")),
data_generator = my_data_generator,
fitter = my_fitter,
metrics = list(rmse_metric()),
n_replicates = 100L,
seed = 42L,
retain = resolve_retention("minimal"),
result_path = paste0("chunk_n", n_val),
checkpoint_every = 50L
)
result <- run_simulation(chunk_config, resume = "auto")
# Save chunk summary
saveRDS(result$summary, paste0("summary_n", n_val, ".rds"))
# Explicit cleanup
rm(result)
gc()
}
# Combine summaries later
all_summaries <- lapply(n_values, function(n) {
readRDS(paste0("summary_n", n, ".rds"))
})
combined_summary <- do.call(rbind, all_summaries)
Use Checkpointing Strategically
Set checkpoint_every based on your model fitting time
and available memory:
# Fast models: checkpoint less frequently
fast_config <- simulation_config(
# ... other params ...
checkpoint_every = 500L # Checkpoint every 500 tasks
)
# Slow models: checkpoint more frequently to avoid losing work
slow_config <- simulation_config(
# ... other params ...
checkpoint_every = 10L # Checkpoint every 10 tasks
)
Clean Up Old Checkpoints
After a simulation completes, you may want to remove old checkpoints to free disk space:
# List checkpoints
checkpoint_ids <- list_checkpoints("my_simulation_results")
cat("Available checkpoints:", checkpoint_ids, "\n")
# Clean old checkpoints, keeping only the 3 most recent
clean_old_checkpoints("my_simulation_results", keep_n = 3)
Summary
This vignette covered bayesim’s memory management system:
- Retention profiles control which artifacts are kept:
  - "minimal": Only metrics (smallest memory)
  - "standard": Metrics + diagnostics + warnings (default, balanced)
  - "debug": Everything (largest memory, for debugging)
- Custom retention allows fine-grained control with explicit character vectors
- Study size guidelines:
  - Small (< 100 tasks): Use "debug" or "standard"
  - Medium (100-1000 tasks): Use "standard"
  - Large (> 1000 tasks): Use "minimal" or custom
- Memory-bounded checkpointing keeps memory usage bounded:
  - Results are checkpointed to disk periodically
  - Heavy objects are cleared from memory after checkpointing
  - The chunk_size parameter controls when memory is freed
- Monitoring tools like object.size(), estimate_size(), and gc() help you understand and plan for memory needs
Recommendations:
- Start with "standard" retention for development
- Switch to "minimal" for runs with many tasks
- Enable checkpointing for studies > 1000 tasks
- Use chunk_size to control memory bounds independently from checkpoint frequency
- Test with small runs to estimate memory needs before scaling up
For more information on checkpointing and resume functionality, see
vignette("getting-started"). For creating custom metrics
that work efficiently with retention profiles, see
vignette("custom-fitters").