datafake-rs
A high-performance mock JSON data generation library for Rust.
Uses JSONLogic for flexible and powerful fake data generation.
datafake-rs is designed for testing, prototyping, and development scenarios where you need realistic mock data.
Features
- JSONLogic Integration - Use powerful JSONLogic expressions to define your data schema
- 50+ Fake Data Types - Generate names, addresses, emails, phone numbers, financial data, and more
- Variables - Pre-generate values and reuse them across your schema
- Batch Generation - Generate multiple records efficiently
- WebAssembly Support - Use in browsers and Node.js via WASM
Quick Example
Here’s a simple example that generates a user profile:
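A minimal sketch of such a configuration, using only types documented under Fake Data Types (the field names are illustrative, not prescribed by the library):

```json
{
  "schema": {
    "id": {"fake": ["uuid"]},
    "name": {"fake": ["name"]},
    "email": {"fake": ["email"]},
    "age": {"fake": ["u8", 18, 65]}
  }
}
```

Values are random, so every run produces a different profile.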
How It Works
- Define a Configuration - Create a JSON configuration with metadata, variables, and schema
- Use the fake Operator - Specify which fake data types to generate using the {"fake": ["type"]} syntax
- Generate Data - The library evaluates the schema and produces realistic fake data
Configuration Structure
A datafake configuration has up to three top-level sections:
{
"metadata": {
"name": "User Generator",
"version": "1.0.0"
},
"variables": {
"userId": {"fake": ["uuid"]}
},
"schema": {
"id": {"var": "userId"},
"name": {"fake": ["name"]},
"email": {"fake": ["email"]}
}
}
- metadata - Optional information about the configuration
- variables - Pre-generate values that can be referenced in the schema
- schema - The structure of the output data with fake data operators
Next Steps
- Try the Playground to experiment with configurations
- Read Quick Start for usage examples
- Browse Fake Data Types to see all available generators
Playground
Generate fake data right in your browser! This playground uses the same WebAssembly-compiled engine that powers the Rust library.
How to Use
- Configuration - Enter your datafake configuration in the editor
- Count - Set the number of records to generate (1-100)
- Generate - Press the Generate button or use Ctrl+Enter (Cmd+Enter on Mac)
- Load Examples - Use the dropdown to load pre-built example configurations
Configuration Structure
| Field | Description |
|---|---|
| metadata | Optional name/version info |
| variables | Pre-generate values to reuse |
| schema | Define the output structure |
The fake Operator
The fake operator is the core of datafake-rs. Use it to generate various types of fake data:
{"fake": ["type"]}
{"fake": ["type", arg1, arg2]}
Common Examples
| Type | Syntax | Description |
|---|---|---|
| UUID | {"fake": ["uuid"]} | UUID v4 identifier |
| Name | {"fake": ["name"]} | Full name |
| Email | {"fake": ["email"]} | Email address |
| Integer | {"fake": ["u8", 18, 65]} | Integer with range |
| Float | {"fake": ["f64", 0.0, 100.0]} | Float with range |
| Choice | {"fake": ["enum", "A", "B", "C"]} | Random selection |
Quick Reference
Personal Data
name, first_name, last_name, title, email, phone_number, username
Address
street_address, city, state_name, country_name, zip_code, latitude, longitude
Internet
ipv4, ipv6, mac_address, domain_name, user_agent, password
Finance
bic, bic8, bic11, iban, credit_card_number, currency_code
Content
word, words, sentence, paragraph, uuid, bool
See Fake Data Types for the complete list.
Installation
Rust
Add datafake-rs to your Cargo.toml:
[dependencies]
datafake-rs = "0.2"
Or use cargo add:
cargo add datafake-rs
WebAssembly (Browser/Node.js)
From npm
npm install datafake-wasm
From Source
Build the WASM package yourself:
# Install wasm-pack if not already installed
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
# Clone and build
git clone https://github.com/GoPlasmatic/datafake-rs.git
cd datafake-rs/datafake-wasm
wasm-pack build --target web
Feature Flags
The library has minimal dependencies by default. No feature flags are required for basic usage.
Minimum Rust Version
datafake-rs requires Rust 2024 edition (Rust 1.85+).
Quick Start
Rust Usage
Basic Generation
use datafake_rs::DataGenerator;
fn main() -> datafake_rs::Result<()> {
let config = r#"{
"schema": {
"id": {"fake": ["uuid"]},
"name": {"fake": ["name"]},
"email": {"fake": ["email"]}
}
}"#;
let generator = DataGenerator::from_json(config)?;
let result = generator.generate()?;
println!("{}", serde_json::to_string_pretty(&result)?);
Ok(())
}
Batch Generation
Generate multiple records efficiently:
use datafake_rs::DataGenerator;
fn main() -> datafake_rs::Result<()> {
let config = r#"{
"schema": {
"id": {"fake": ["uuid"]},
"name": {"fake": ["name"]}
}
}"#;
let generator = DataGenerator::from_json(config)?;
let results = generator.generate_batch(100)?;
println!("Generated {} records", results.len());
Ok(())
}
JavaScript/TypeScript Usage
Browser (ES Modules)
import init, { generate, FakeDataGenerator } from 'datafake-wasm';
async function main() {
await init();
// One-off generation
const config = JSON.stringify({
schema: {
id: { fake: ["uuid"] },
name: { fake: ["name"] },
email: { fake: ["email"] }
}
});
const result = JSON.parse(generate(config));
console.log(result);
}
main();
Reusable Generator
For generating multiple records, use the FakeDataGenerator class:
import init, { FakeDataGenerator } from 'datafake-wasm';
async function main() {
await init();
const config = JSON.stringify({
schema: {
id: { fake: ["uuid"] },
name: { fake: ["name"] }
}
});
const gen = new FakeDataGenerator(config);
// Generate single records
const user1 = JSON.parse(gen.generate());
const user2 = JSON.parse(gen.generate());
// Generate batch
const users = JSON.parse(gen.generate_batch(10));
// Clean up when done
gen.free();
}
main();
Node.js
const { generate, FakeDataGenerator } = require('datafake-wasm');
const config = JSON.stringify({
schema: {
id: { fake: ["uuid"] },
name: { fake: ["name"] }
}
});
const result = JSON.parse(generate(config));
console.log(result);
Basic Concepts
Configuration Structure
A datafake configuration is a JSON object with up to three top-level sections:
{
"metadata": { ... },
"variables": { ... },
"schema": { ... }
}
Metadata
Optional information about the configuration:
{
"metadata": {
"name": "User Generator",
"version": "1.0.0",
"description": "Generates fake user data"
}
}
Variables
Pre-generate values that can be reused across the schema:
{
"variables": {
"userId": {"fake": ["uuid"]},
"createdAt": {"fake": ["datetime"]}
},
"schema": {
"id": {"var": "userId"},
"audit": {
"createdBy": {"var": "userId"},
"createdAt": {"var": "createdAt"}
}
}
}
Schema
The schema defines the structure of the generated output. It uses JSONLogic operators, with the fake operator being the most important.
The fake Operator
The fake operator generates fake data of a specified type:
{"fake": ["type"]}
{"fake": ["type", arg1, arg2]}
Simple Types
{"fake": ["uuid"]}
{"fake": ["name"]}
{"fake": ["email"]}
Types with Arguments
Some types accept arguments for customization:
{"fake": ["u8", 18, 65]} // Integer between 18 and 65
{"fake": ["password", 8, 16]} // Password with 8-16 characters
{"fake": ["words", 5]} // 5 random words
JSONLogic Operators
datafake-rs supports all standard JSONLogic operators. Here are some commonly used ones:
Variable Access
Reference variables with var:
{"var": "userId"}
{"var": "user.name"}
String Concatenation
Combine strings with cat:
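As a small sketch, the configuration below builds a greeting from a literal prefix and a generated name (the field name greeting is illustrative):

```json
{
  "schema": {
    "greeting": {"cat": ["Hello, ", {"fake": ["name"]}, "!"]}
  }
}
```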
Conditional Logic
Use if for conditional generation:
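A hedged sketch: here a random boolean decides which of two literal values ends up in the output. The field name and the two status strings are illustrative.

```json
{
  "schema": {
    "status": {"if": [{"fake": ["bool"]}, "active", "inactive"]}
  }
}
```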
Arrays
Generate arrays with map:
{
"schema": {
"tags": {"map": [
[1, 2, 3],
{"fake": ["word"]}
]}
}
}
Nested Objects
Create complex nested structures:
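One possible shape, assuming nested objects in the schema are evaluated field by field as in the audit example above (all field names are illustrative):

```json
{
  "schema": {
    "user": {
      "name": {"fake": ["name"]},
      "contact": {
        "email": {"fake": ["email"]},
        "phone": {"fake": ["phone_number"]}
      },
      "address": {
        "city": {"fake": ["city"]},
        "country": {"fake": ["country_name"]}
      }
    }
  }
}
```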
Error Handling
Invalid configurations will result in clear error messages:
- ConfigParse - Invalid JSON syntax
- InvalidConfig - Missing required fields
- FakeOperatorError - Unknown fake type or invalid arguments
- VariableNotFound - Referenced variable doesn’t exist
Fake Data Types Overview
datafake-rs supports over 50 different fake data types organized into categories. Each type is invoked using the fake operator:
{"fake": ["type_name"]}
{"fake": ["type_name", arg1, arg2]}
Categories
| Category | Types | Description |
|---|---|---|
| Numeric | 10 | Integers and floating-point numbers with optional ranges |
| Personal | 7 | Names, titles, and personal information |
| Address | 12 | Street addresses, cities, countries, coordinates |
| Internet | 11 | Emails, usernames, IPs, domains |
| Company | 9 | Company names, industries, buzzwords |
| Finance | 10 | BIC, IBAN, credit cards, currencies |
| Content | 5 | Words, sentences, paragraphs, UUIDs |
| Date & Time | 5 | Dates, times, timestamps |
| Other | 8 | Booleans, files, barcodes, custom choices |
Quick Reference
Most Common Types
All Types by Category
Numeric
u8, u16, u32, u64, i8, i16, i32, i64, f32, f64
Personal
name, full_name, first_name, last_name, name_with_title, title, suffix
Address
street_address, street_name, street_suffix, city, city_name, state_name, state_abbr, country_name, country_code, zip_code, post_code, latitude, longitude
Internet
email, safe_email, free_email, username, password, domain_name, domain_suffix, ipv4, ipv6, mac_address, user_agent
Company
company_name, company_suffix, industry, profession, catch_phrase, bs, bs_adj, bs_noun, bs_verb
Finance
bic, bic8, bic11, credit_card_number, currency_code, currency_name, currency_symbol, iban, lei, alphanumeric
Content
uuid, word, words, sentence, paragraph
Date & Time
datetime, iso8601_datetime, date, time, month_name
Other
bool, boolean, enum, pick, choice, regex, file_name, file_extension, file_path, dir_path, isbn10, isbn13
Numeric Types
Generate random numbers with optional range constraints.
Integers
Unsigned Integers
u8 (0 to 255)
{"fake": ["u8"]}
{"fake": ["u8", min, max]}
u16 (0 to 65,535)
{"fake": ["u16"]}
{"fake": ["u16", min, max]}
u32 (0 to 4,294,967,295)
{"fake": ["u32"]}
{"fake": ["u32", min, max]}
u64 (0 to 18,446,744,073,709,551,615)
{"fake": ["u64"]}
{"fake": ["u64", min, max]}
Signed Integers
i8 (-128 to 127)
{"fake": ["i8"]}
{"fake": ["i8", min, max]}
i16 (-32,768 to 32,767)
{"fake": ["i16"]}
{"fake": ["i16", min, max]}
i32 (-2,147,483,648 to 2,147,483,647)
{"fake": ["i32"]}
{"fake": ["i32", min, max]}
i64
{"fake": ["i64"]}
{"fake": ["i64", min, max]}
Floating Point
f32 (32-bit float)
{"fake": ["f32"]}
{"fake": ["f32", min, max]}
f64 (64-bit float)
{"fake": ["f64"]}
{"fake": ["f64", min, max]}
Usage Notes
- When using ranges, both min and max are inclusive
- If no range is specified, the full range of the type is used
- Float types may produce very large or very small numbers without constraints
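To make the range rules concrete, here is a small sketch with one unsigned, one signed, and one floating-point field. The field names and bounds are illustrative; per the notes above, the bounds themselves can appear in the output.

```json
{
  "schema": {
    "age": {"fake": ["u8", 18, 65]},
    "temperature": {"fake": ["i16", -40, 50]},
    "score": {"fake": ["f64", 0.0, 100.0]}
  }
}
```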
Personal Data
Generate realistic personal information.
Names
name / full_name
Generates a complete name:
{"fake": ["name"]}
{"fake": ["full_name"]}
first_name
Generates a first name:
{"fake": ["first_name"]}
last_name
Generates a last name:
{"fake": ["last_name"]}
name_with_title
Generates a name with a professional title:
{"fake": ["name_with_title"]}
title
Generates a title/honorific (Mr., Mrs., Dr., etc.):
{"fake": ["title"]}
suffix
Generates a name suffix (Jr., Sr., III, etc.):
{"fake": ["suffix"]}
Complete User Profile
Combine personal data types for a complete profile:
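A sketch of such a profile, built only from the name types listed above (field names are illustrative):

```json
{
  "schema": {
    "title": {"fake": ["title"]},
    "firstName": {"fake": ["first_name"]},
    "lastName": {"fake": ["last_name"]},
    "suffix": {"fake": ["suffix"]},
    "displayName": {"fake": ["name_with_title"]}
  }
}
```

Each field is generated independently, so displayName will not match firstName and lastName. When the parts must agree, pre-generate them as variables and join them with cat.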
Phone Numbers
phone_number
Generates a phone number:
{"fake": ["phone_number"]}
cell_number
Generates a cell phone number:
{"fake": ["cell_number"]}
Address Data
Generate realistic address components and geographic data.
Street Address
street_address
Generates a full street address:
{"fake": ["street_address"]}
street_name
Generates just the street name:
{"fake": ["street_name"]}
street_suffix
Generates a street suffix (Street, Avenue, Lane, etc.):
{"fake": ["street_suffix"]}
City and State
city / city_name
Generates a city name:
{"fake": ["city"]}
{"fake": ["city_name"]}
state_name
Generates a US state name:
{"fake": ["state_name"]}
state_abbr
Generates a US state abbreviation:
{"fake": ["state_abbr"]}
Country
country_name
Generates a country name:
{"fake": ["country_name"]}
country_code
Generates a country code:
{"fake": ["country_code"]}
Postal Codes
zip_code / zip
Generates a US ZIP code:
{"fake": ["zip_code"]}
{"fake": ["zip"]}
post_code / postcode / postal_code
Generates a postal code:
{"fake": ["post_code"]}
{"fake": ["postcode"]}
{"fake": ["postal_code"]}
Geographic Coordinates
latitude
Generates a latitude value (-90 to 90):
{"fake": ["latitude"]}
longitude
Generates a longitude value (-180 to 180):
{"fake": ["longitude"]}
Complete Address
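A sketch of a full address record assembled from the types above. Field names are illustrative, and because each component is generated independently, the city, state, and coordinates will not correspond to one real place.

```json
{
  "schema": {
    "street": {"fake": ["street_address"]},
    "city": {"fake": ["city"]},
    "state": {"fake": ["state_abbr"]},
    "zip": {"fake": ["zip_code"]},
    "country": {"fake": ["country_name"]},
    "location": {
      "lat": {"fake": ["latitude"]},
      "lng": {"fake": ["longitude"]}
    }
  }
}
```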
Internet Data
Generate internet-related fake data like emails, usernames, IPs, and more.
Email Addresses
email / safe_email
Generates a safe email address (uses example.com domains):
{"fake": ["email"]}
{"fake": ["safe_email"]}
free_email
Generates an email from free email providers:
{"fake": ["free_email"]}
Usernames and Passwords
username
Generates a username:
{"fake": ["username"]}
password
Generates a password with configurable length:
{"fake": ["password"]}
{"fake": ["password", min_length, max_length]}
Domains
domain_name
Generates a domain name:
{"fake": ["domain_name"]}
domain_suffix
Generates a domain suffix (com, org, net, etc.):
{"fake": ["domain_suffix"]}
IP Addresses
ipv4
Generates an IPv4 address:
{"fake": ["ipv4"]}
ipv6
Generates an IPv6 address:
{"fake": ["ipv6"]}
Network
mac_address
Generates a MAC address:
{"fake": ["mac_address"]}
user_agent
Generates a browser user agent string:
{"fake": ["user_agent"]}
Complete Network Profile
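One way to combine the internet types into a single record (a sketch; the field names and the 12-20 character password length are illustrative choices):

```json
{
  "schema": {
    "username": {"fake": ["username"]},
    "email": {"fake": ["email"]},
    "password": {"fake": ["password", 12, 20]},
    "ip": {"fake": ["ipv4"]},
    "mac": {"fake": ["mac_address"]},
    "userAgent": {"fake": ["user_agent"]}
  }
}
```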
Company Data
Generate business-related fake data.
Company Names
company_name
Generates a company name:
{"fake": ["company_name"]}
company_suffix
Generates a company suffix (Inc., LLC, Corp., etc.):
{"fake": ["company_suffix"]}
Industry and Profession
industry
Generates an industry name:
{"fake": ["industry"]}
profession
Generates a profession/job title:
{"fake": ["profession"]}
Marketing Buzzwords
catch_phrase
Generates a company catch phrase:
{"fake": ["catch_phrase"]}
bs (Business Speak)
Generates business jargon:
{"fake": ["bs"]}
bs_adj
Generates a business adjective:
{"fake": ["bs_adj"]}
bs_noun
Generates a business noun:
{"fake": ["bs_noun"]}
bs_verb
Generates a business verb:
{"fake": ["bs_verb"]}
Complete Company Profile
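A possible company record using the types from this section, plus name from Personal Data for the contact person (field names are illustrative):

```json
{
  "schema": {
    "name": {"fake": ["company_name"]},
    "industry": {"fake": ["industry"]},
    "slogan": {"fake": ["catch_phrase"]},
    "contact": {
      "name": {"fake": ["name"]},
      "role": {"fake": ["profession"]}
    }
  }
}
```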
Finance Data
Generate financial data for testing payment systems and banking applications.
Bank Identifiers
bic
Generates a BIC (Bank Identifier Code), randomly 8 or 11 characters:
{"fake": ["bic"]}
{"fake": ["bic", 8]} // Force 8 characters
{"fake": ["bic", 11]} // Force 11 characters
bic8
Generates an 8-character BIC:
{"fake": ["bic8"]}
bic11
Generates an 11-character BIC (with branch code):
{"fake": ["bic11"]}
iban
Generates an IBAN with optional country code:
{"fake": ["iban"]}
{"fake": ["iban", "DE"]} // German IBAN
{"fake": ["iban", "FR"]} // French IBAN
lei
Generates a Legal Entity Identifier:
{"fake": ["lei"]}
Credit Cards
credit_card_number
Generates a credit card number:
{"fake": ["credit_card_number"]}
Currency
currency_code
Generates a currency code (USD, EUR, etc.):
{"fake": ["currency_code"]}
currency_name
Generates a currency name:
{"fake": ["currency_name"]}
currency_symbol
Generates a currency symbol:
{"fake": ["currency_symbol"]}
Alphanumeric
alphanumeric
Generates an alphanumeric string (useful for reference numbers):
{"fake": ["alphanumeric", length]}
{"fake": ["alphanumeric", min_length, max_length]}
Complete Transaction
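A sketch of a payment record that combines the finance types with uuid and f64. The field names, the amount range, and the DE/FR country codes are illustrative assumptions.

```json
{
  "schema": {
    "transactionId": {"fake": ["uuid"]},
    "reference": {"fake": ["alphanumeric", 12]},
    "amount": {"fake": ["f64", 1.0, 10000.0]},
    "currency": {"fake": ["currency_code"]},
    "debtor": {
      "iban": {"fake": ["iban", "DE"]},
      "bic": {"fake": ["bic11"]}
    },
    "creditor": {
      "iban": {"fake": ["iban", "FR"]},
      "bic": {"fake": ["bic8"]}
    }
  }
}
```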
Content Data
Generate text content, identifiers, and lorem ipsum text.
Identifiers
uuid
Generates a UUID v4:
{"fake": ["uuid"]}
Lorem Ipsum
word
Generates a single random word:
{"fake": ["word"]}
words
Generates multiple words (default 5):
{"fake": ["words"]}
{"fake": ["words", count]}
sentence
Generates a sentence with configurable word count:
{"fake": ["sentence"]}
{"fake": ["sentence", min_words, max_words]}
paragraph
Generates a paragraph with configurable sentence count:
{"fake": ["paragraph"]}
{"fake": ["paragraph", min_sentences, max_sentences]}
Content Generation Examples
Blog Post
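A sketch of a blog post record. The word and sentence counts are arbitrary, and the tags field reuses the map pattern from Basic Concepts to produce three random words.

```json
{
  "schema": {
    "id": {"fake": ["uuid"]},
    "title": {"fake": ["sentence", 4, 8]},
    "author": {"fake": ["name"]},
    "body": {"fake": ["paragraph", 3, 6]},
    "tags": {"map": [[1, 2, 3], {"fake": ["word"]}]}
  }
}
```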
Product Description
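A short sketch of a product blurb; the field names and counts are illustrative.

```json
{
  "schema": {
    "name": {"fake": ["words", 2]},
    "tagline": {"fake": ["sentence", 3, 6]},
    "description": {"fake": ["paragraph", 2, 4]}
  }
}
```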
Comments
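A sketch of a single comment record; generating it through batch generation would yield a list of comments. Field names are illustrative.

```json
{
  "schema": {
    "id": {"fake": ["uuid"]},
    "author": {"fake": ["username"]},
    "text": {"fake": ["sentence", 5, 15]},
    "postedAt": {"fake": ["datetime"]}
  }
}
```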
Date & Time Data
Generate dates, times, and timestamps.
Timestamps
datetime / iso8601_datetime
Generates an ISO 8601 formatted datetime:
{"fake": ["datetime"]}
{"fake": ["iso8601_datetime"]}
Date
date
Generates a date with optional format:
{"fake": ["date"]}
{"fake": ["date", format]}
Default format is %Y-%m-%d.
Common Date Formats
| Format | Example |
|---|---|
| %Y-%m-%d | 2024-01-15 |
| %m/%d/%Y | 01/15/2024 |
| %d.%m.%Y | 15.01.2024 |
| %B %d, %Y | January 15, 2024 |
| %Y%m%d | 20240115 |
Time
time
Generates a time in HH:MM:SS format:
{"fake": ["time"]}
Month
month_name
Generates a random month name:
{"fake": ["month_name"]}
Complete Examples
Event
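A sketch of an event record that mixes a formatted date with a time and a creation timestamp (field names and the chosen format are illustrative):

```json
{
  "schema": {
    "id": {"fake": ["uuid"]},
    "title": {"fake": ["sentence", 2, 5]},
    "date": {"fake": ["date", "%B %d, %Y"]},
    "startTime": {"fake": ["time"]},
    "createdAt": {"fake": ["datetime"]}
  }
}
```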
Audit Log
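A sketch of an audit entry. The action names and field names are illustrative; the timestamp is pre-generated as a variable so that both fields carry the same value.

```json
{
  "variables": {
    "timestamp": {"fake": ["datetime"]}
  },
  "schema": {
    "action": {"fake": ["enum", "create", "update", "delete"]},
    "actor": {"fake": ["username"]},
    "sourceIp": {"fake": ["ipv4"]},
    "createdAt": {"var": "timestamp"},
    "updatedAt": {"var": "timestamp"}
  }
}
```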
Schedule
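A sketch of a schedule entry using month_name, a custom date format, and time (field names are illustrative; the fields are generated independently, so the month and date need not agree):

```json
{
  "schema": {
    "month": {"fake": ["month_name"]},
    "date": {"fake": ["date", "%d.%m.%Y"]},
    "time": {"fake": ["time"]}
  }
}
```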
Other Data Types
Additional fake data types including booleans, files, barcodes, and custom choices.
Boolean
bool / boolean
Generates a random boolean:
{"fake": ["bool"]}
{"fake": ["boolean"]}
Custom Choices
enum / pick / choice
Pick a random value from provided options:
{"fake": ["enum", "option1", "option2", "option3"]}
{"fake": ["pick", "option1", "option2", "option3"]}
{"fake": ["choice", "option1", "option2", "option3"]}
regex
Generate from simple alternation patterns:
{"fake": ["regex", "(A|B|C)"]}
Note: Only simple alternation patterns like (A|B|C) are supported. Complex regex patterns are not supported.
File System
file_name
Generates a file name with extension:
{"fake": ["file_name"]}
file_extension
Generates a file extension:
{"fake": ["file_extension"]}
file_path
Generates a file path:
{"fake": ["file_path"]}
dir_path
Generates a directory path:
{"fake": ["dir_path"]}
Barcodes
isbn10
Generates an ISBN-10:
{"fake": ["isbn10"]}
isbn13
Generates an ISBN-13:
{"fake": ["isbn13"]}
Complete Examples
Feature Flags
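A sketch of a feature-flag record built from booleans and a custom choice. The flag names and tier values are illustrative.

```json
{
  "schema": {
    "darkMode": {"fake": ["bool"]},
    "betaAccess": {"fake": ["boolean"]},
    "tier": {"fake": ["enum", "free", "pro", "enterprise"]}
  }
}
```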
E-commerce Product
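A sketch of a product record. Field names, the price range, and the size and status options are illustrative; the size field uses the simple alternation form of regex described above.

```json
{
  "schema": {
    "sku": {"fake": ["alphanumeric", 10]},
    "name": {"fake": ["words", 3]},
    "price": {"fake": ["f64", 1.0, 500.0]},
    "size": {"fake": ["regex", "(S|M|L|XL)"]},
    "status": {"fake": ["pick", "in_stock", "backorder", "discontinued"]},
    "image": {"fake": ["file_name"]}
  }
}
```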
Library Book
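A sketch of a library catalogue entry using the ISBN generators (field names are illustrative; the two ISBNs are generated independently and do not identify the same book):

```json
{
  "schema": {
    "isbn10": {"fake": ["isbn10"]},
    "isbn13": {"fake": ["isbn13"]},
    "title": {"fake": ["sentence", 2, 6]},
    "author": {"fake": ["name"]},
    "available": {"fake": ["bool"]}
  }
}
```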
Variable System
Variables allow you to pre-generate values and reuse them across your schema. This is useful when you need the same value in multiple places or want to create relationships between fields.
Defining Variables
Variables are defined in the variables section of your configuration:
{
"variables": {
"userId": {"fake": ["uuid"]},
"createdAt": {"fake": ["datetime"]}
},
"schema": {
...
}
}
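Each variable is evaluated once at the start of each record's generation, and the result is reused wherever that record references it. In a batch, variables are re-evaluated for every record.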
Referencing Variables
Use the var operator to reference a variable:
{"var": "variableName"}
Basic Example
In this example, userId appears in both id and audit.createdBy with the same value, and timestamp is used for both createdAt and updatedAt.
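A configuration matching that description could look like the following sketch. The variable and field names are taken from the sentence above; the exact layout is an assumption.

```json
{
  "variables": {
    "userId": {"fake": ["uuid"]},
    "timestamp": {"fake": ["datetime"]}
  },
  "schema": {
    "id": {"var": "userId"},
    "createdAt": {"var": "timestamp"},
    "updatedAt": {"var": "timestamp"},
    "audit": {
      "createdBy": {"var": "userId"}
    }
  }
}
```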
Use Cases
Consistent IDs Across Related Objects
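A sketch of the idea: one order ID and one customer ID are generated up front and then referenced from several related objects. All names are illustrative.

```json
{
  "variables": {
    "orderId": {"fake": ["uuid"]},
    "customerId": {"fake": ["uuid"]}
  },
  "schema": {
    "order": {
      "id": {"var": "orderId"},
      "customerId": {"var": "customerId"}
    },
    "shipment": {
      "orderId": {"var": "orderId"},
      "address": {"fake": ["street_address"]}
    },
    "invoice": {
      "orderId": {"var": "orderId"},
      "customerId": {"var": "customerId"}
    }
  }
}
```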
Generating Related Data
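A sketch of deriving several fields from the same pre-generated name parts. The example.com domain and the field names are illustrative, and the derived email keeps whatever capitalization the generated names have.

```json
{
  "variables": {
    "firstName": {"fake": ["first_name"]},
    "lastName": {"fake": ["last_name"]}
  },
  "schema": {
    "firstName": {"var": "firstName"},
    "lastName": {"var": "lastName"},
    "fullName": {"cat": [{"var": "firstName"}, " ", {"var": "lastName"}]},
    "email": {"cat": [{"var": "firstName"}, ".", {"var": "lastName"}, "@example.com"]}
  }
}
```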
Timestamps for Audit Trails
Variable Scope
Variables are evaluated in the order they are defined, and each variable can only reference variables defined before it.
{
"variables": {
"firstName": {"fake": ["first_name"]},
"lastName": {"fake": ["last_name"]},
"fullName": {"cat": [{"var": "firstName"}, " ", {"var": "lastName"}]}
},
"schema": {
"name": {"var": "fullName"},
"firstName": {"var": "firstName"},
"lastName": {"var": "lastName"}
}
}
Variables vs Schema Values
| Feature | Variables | Schema Values |
|---|---|---|
| Evaluated | Once at start | For each generation |
| Reusable | Yes, via var | No |
| Order-dependent | Yes | No |
| Best for | Shared values, relationships | Unique values per field |
Tips
- Use variables for shared IDs - When multiple fields need the same identifier
- Use variables for timestamps - When created/updated times should match
- Use variables for names - When you need to derive email from name
- Keep variables simple - Complex expressions are better in the schema
JSONLogic Integration
datafake-rs is built on top of datalogic-rs, a high-performance JSONLogic implementation. This means you can use all standard JSONLogic operators alongside the fake operator.
Core Operators
Variable Access
Use var to access variables or nested data:
{"var": "variableName"}
{"var": "nested.path.to.value"}
String Concatenation
Use cat to combine strings:
{"cat": ["Hello, ", {"fake": ["name"]}, "!"]}
Conditional Logic
Use if for conditional generation:
{"if": [condition, then_value, else_value]}
Comparison Operators
| Operator | Description | Example |
|---|---|---|
| == | Equal | {"==": [1, 1]} |
| != | Not equal | {"!=": [1, 2]} |
| > | Greater than | {">": [5, 3]} |
| >= | Greater or equal | {">=": [5, 5]} |
| < | Less than | {"<": [3, 5]} |
| <= | Less or equal | {"<=": [5, 5]} |
Logical Operators
and
All conditions must be true:
{"and": [condition1, condition2, ...]}
or
At least one condition must be true:
{"or": [condition1, condition2, ...]}
not / !
Negates a condition:
{"!": [condition]}
{"not": [condition]}
Arithmetic Operators
| Operator | Description | Example |
|---|---|---|
| + | Add | {"+": [1, 2, 3]} → 6 |
| - | Subtract | {"-": [10, 3]} → 7 |
| * | Multiply | {"*": [2, 3, 4]} → 24 |
| / | Divide | {"/": [10, 2]} → 5 |
| % | Modulo | {"%": [10, 3]} → 1 |
Array Operators
map
Transform each element:
{"map": [array, transformation]}
filter
Filter elements by condition:
{"filter": [array, condition]}
reduce
Reduce array to single value:
{"reduce": [array, reducer, initial_value]}
merge
Combine arrays:
{"merge": [array1, array2]}
String Operators
cat
Concatenate strings:
{"cat": ["string1", "string2", ...]}
substr
Extract substring:
{"substr": ["string", start, length]}
Combining with fake
You can combine any JSONLogic operator with fake:
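For instance, this sketch feeds fake values into cat and into a comparison. The field names and the 0-99 age range are illustrative.

```json
{
  "schema": {
    "displayName": {"cat": [{"fake": ["first_name"]}, " ", {"fake": ["last_name"]}]},
    "isAdult": {">=": [{"fake": ["u8", 0, 99]}, 18]}
  }
}
```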
Complex Example
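A larger sketch that combines variables, cat, comparisons, and a chained if (standard JSONLogic allows if to take condition/value pairs followed by a final else value). Names, ranges, and category labels are illustrative assumptions.

```json
{
  "variables": {
    "firstName": {"fake": ["first_name"]},
    "lastName": {"fake": ["last_name"]},
    "age": {"fake": ["u8", 13, 80]}
  },
  "schema": {
    "name": {"cat": [{"var": "firstName"}, " ", {"var": "lastName"}]},
    "age": {"var": "age"},
    "category": {"if": [
      {"<": [{"var": "age"}, 18]}, "minor",
      {"<": [{"var": "age"}, 65]}, "adult",
      "senior"
    ]},
    "email": {"cat": [{"var": "firstName"}, ".", {"var": "lastName"}, "@example.com"]}
  }
}
```

Because age is a variable, the age field and the category derived from it always agree within a record.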
Learn More
For complete JSONLogic documentation, see the JSONLogic reference and the datalogic-rs documentation.
Batch Generation
When you need to generate multiple records, batch generation is more efficient than calling generate multiple times.
Rust API
Basic Batch Generation
use datafake_rs::DataGenerator;
fn main() -> datafake_rs::Result<()> {
let config = r#"{
"schema": {
"id": {"fake": ["uuid"]},
"name": {"fake": ["name"]},
"email": {"fake": ["email"]}
}
}"#;
let generator = DataGenerator::from_json(config)?;
// Generate 100 records
let users = generator.generate_batch(100)?;
println!("Generated {} users", users.len());
for user in &users {
println!("{}", user);
}
Ok(())
}
Reusing the Generator
The DataGenerator can be reused for multiple batch generations:
let generator = DataGenerator::from_json(config)?;
// Generate different batches
let batch1 = generator.generate_batch(10)?;
let batch2 = generator.generate_batch(20)?;
let batch3 = generator.generate_batch(50)?;
JavaScript/TypeScript API
Using FakeDataGenerator
import init, { FakeDataGenerator } from 'datafake-wasm';
async function main() {
await init();
const config = JSON.stringify({
schema: {
id: { fake: ["uuid"] },
name: { fake: ["name"] },
email: { fake: ["email"] }
}
});
const gen = new FakeDataGenerator(config);
// Generate 100 records
const users = JSON.parse(gen.generate_batch(100));
console.log(`Generated ${users.length} users`);
// Clean up when done
gen.free();
}
Memory Management
When using FakeDataGenerator in JavaScript, always call free() when you’re done to release WASM memory:
const gen = new FakeDataGenerator(config);
try {
const batch = gen.generate_batch(1000);
// Process batch...
} finally {
gen.free();
}
Performance Tips
1. Reuse the Generator
Creating a generator parses and validates the configuration. Reuse the same generator for multiple batches:
// Good - parse once, generate many times
let generator = DataGenerator::from_json(config)?;
for _ in 0..100 {
    let batch = generator.generate_batch(1000)?;
}
// Bad - parsing on every iteration
for _ in 0..100 {
    let generator = DataGenerator::from_json(config)?;
    let batch = generator.generate_batch(1000)?;
}
2. Use Appropriate Batch Sizes
Larger batches are more efficient due to reduced function call overhead:
// Better - one batch of 10,000
let batch = generator.generate_batch(10000)?;
// Worse - 100 batches of 100
for _ in 0..100 {
    let batch = generator.generate_batch(100)?;
}
3. Minimize Complex Expressions
Simple schemas generate faster than complex nested expressions:
// Fast
{"schema": {"id": {"fake": ["uuid"]}, "name": {"fake": ["name"]}}}
// Slower (many nested operations)
{"schema": {"data": {"if": [{"==": [{"fake": ["bool"]}, true]}, ...]}}}
Streaming Large Batches
For very large datasets, consider generating in chunks to manage memory:
let generator = DataGenerator::from_json(config)?;
// Generate 1 million records in chunks
let chunk_size = 10000;
let total_records = 1_000_000;
for _ in 0..(total_records / chunk_size) {
    let batch = generator.generate_batch(chunk_size)?;
    // Process or write batch to file/database
    // (process_batch stands in for your own handler)
    process_batch(&batch)?;
}
Output Format
Batch generation returns a JSON array:
[
{"id": "...", "name": "Alice Smith", "email": "alice@example.com"},
{"id": "...", "name": "Bob Jones", "email": "bob@example.com"},
{"id": "...", "name": "Carol Brown", "email": "carol@example.com"}
]
Each record in the batch is generated independently. Variables are regenerated for each record, so a variable's value is shared across the fields of one record but not across records.
API Reference
Rust API
DataGenerator
The main entry point for generating fake data.
use datafake_rs::DataGenerator;
Constructors
from_json
Create a generator from a JSON string:
pub fn from_json(json_str: &str) -> Result<Self>
Example:
let config = r#"{"schema": {"id": {"fake": ["uuid"]}}}"#;
let generator = DataGenerator::from_json(config)?;
from_value
Create a generator from a serde_json::Value:
pub fn from_value(json_value: Value) -> Result<Self>
Example:
use serde_json::json;
let config = json!({
    "schema": {
        "id": {"fake": ["uuid"]}
    }
});
let generator = DataGenerator::from_value(config)?;
new
Create a generator from a DataFakeConfig:
pub fn new(config: DataFakeConfig) -> Self
Methods
generate
Generate a single record:
pub fn generate(&self) -> Result<Value>
Example:
let result = generator.generate()?;
println!("{}", serde_json::to_string_pretty(&result)?);
generate_batch
Generate multiple records:
pub fn generate_batch(&self, count: usize) -> Result<Vec<Value>>
Example:
let results = generator.generate_batch(100)?;
for result in results {
    println!("{}", result);
}
DataFakeConfig
Configuration structure for the generator.
pub struct DataFakeConfig {
    pub metadata: Option<Metadata>,
    pub variables: HashMap<String, Value>,
    pub schema: Value,
}
Metadata
Optional metadata for the configuration.
pub struct Metadata {
    pub name: Option<String>,
    pub version: Option<String>,
    pub description: Option<String>,
    pub extra: HashMap<String, Value>,
}
Error Types
pub enum DataFakeError {
    ConfigParse(String),
    InvalidConfig(String),
    VariableNotFound(String),
    JsonError(serde_json::Error),
    FakeOperatorError(String),
    TypeConversion(String),
    InvalidLocale(String),
    InvalidRange(String),
}
| Error | Description |
|---|---|
| ConfigParse | Failed to parse JSON configuration |
| InvalidConfig | Configuration is missing required fields |
| VariableNotFound | Referenced variable doesn’t exist |
| JsonError | JSON serialization/deserialization error |
| FakeOperatorError | Invalid fake operator or arguments |
| TypeConversion | Type conversion failed |
| InvalidLocale | Invalid locale specified |
| InvalidRange | Invalid numeric range |
WebAssembly API
generate
One-off generation from a configuration string:
function generate(config: string): string
Parameters:
- config - JSON string containing the datafake configuration
Returns:
- JSON string of the generated data
Throws:
- Error if configuration is invalid
Example:
import init, { generate } from 'datafake-wasm';
await init();
const config = JSON.stringify({
schema: { id: { fake: ["uuid"] } }
});
const result = JSON.parse(generate(config));
FakeDataGenerator
Reusable generator class for multiple generations.
Constructor
new FakeDataGenerator(config: string)
Parameters:
- config - JSON string containing the datafake configuration
Throws:
- Error if configuration is invalid
Methods
generate
Generate a single record:
generate(): string
Returns:
- JSON string of the generated data
generate_batch
Generate multiple records:
generate_batch(count: number): string
Parameters:
- count - Number of records to generate
Returns:
- JSON string of an array containing the generated records
free
Release WASM memory:
free(): void
Important: Always call free() when done to prevent memory leaks.
Example:
import init, { FakeDataGenerator } from 'datafake-wasm';
await init();
const config = JSON.stringify({
schema: {
id: { fake: ["uuid"] },
name: { fake: ["name"] }
}
});
const gen = new FakeDataGenerator(config);
try {
const single = JSON.parse(gen.generate());
const batch = JSON.parse(gen.generate_batch(10));
} finally {
gen.free();
}
Configuration Schema
Full Configuration
{
"metadata": {
"name": "string (optional)",
"version": "string (optional)",
"description": "string (optional)",
"...": "any additional fields"
},
"variables": {
"variableName": "<JSONLogic expression>"
},
"schema": "<JSONLogic expression>"
}
fake Operator
{"fake": ["type"]}
{"fake": ["type", arg1, arg2, ...]}
See Fake Data Types for all available types.
var Operator
Reference a variable:
{"var": "variableName"}
{"var": "nested.path"}
Complete Example
use datafake_rs::{DataGenerator, Result};
fn main() -> Result<()> {
let config = r#"{
"metadata": {
"name": "User Generator",
"version": "1.0.0"
},
"variables": {
"userId": {"fake": ["uuid"]},
"createdAt": {"fake": ["datetime"]}
},
"schema": {
"id": {"var": "userId"},
"profile": {
"name": {"fake": ["name"]},
"email": {"fake": ["email"]},
"age": {"fake": ["u8", 18, 65]}
},
"metadata": {
"createdAt": {"var": "createdAt"},
"createdBy": {"var": "userId"}
}
}
}"#;
let generator = DataGenerator::from_json(config)?;
// Generate single record
let user = generator.generate()?;
println!("Single user: {}", serde_json::to_string_pretty(&user)?);
// Generate batch
let users = generator.generate_batch(10)?;
println!("Generated {} users", users.len());
Ok(())
}
Links
- docs.rs/datafake-rs - Full Rust API documentation
- GitHub Repository
- npm Package