datafake-rs

A high-performance mock JSON data generation library for Rust.

Uses JSONLogic for flexible and powerful fake data generation.

datafake-rs is designed for testing, prototyping, and development scenarios where you need realistic mock data.

Features

  • JSONLogic Integration - Use powerful JSONLogic expressions to define your data schema
  • 50+ Fake Data Types - Generate names, addresses, emails, phone numbers, financial data, and more
  • Variables - Pre-generate values and reuse them across your schema
  • Batch Generation - Generate multiple records efficiently
  • WebAssembly Support - Use in browsers and Node.js via WASM

How It Works

  1. Define a Configuration - Create a JSON configuration with metadata, variables, and schema
  2. Use the fake Operator - Specify which fake data types to generate using the {"fake": ["type"]} syntax
  3. Generate Data - The library evaluates the schema and produces realistic fake data

Configuration Structure

A datafake configuration has three sections; metadata and variables can be omitted:

{
    "metadata": {
        "name": "User Generator",
        "version": "1.0.0"
    },
    "variables": {
        "userId": {"fake": ["uuid"]}
    },
    "schema": {
        "id": {"var": "userId"},
        "name": {"fake": ["name"]},
        "email": {"fake": ["email"]}
    }
}

  • metadata - Optional information about the configuration
  • variables - Pre-generate values that can be referenced in the schema
  • schema - The structure of the output data with fake data operators

Playground

Generate fake data right in your browser! This playground uses the same WebAssembly-compiled engine that powers the Rust library.

How to Use

  1. Configuration - Enter your datafake configuration in the editor
  2. Count - Set the number of records to generate (1-100)
  3. Generate - Press the Generate button or use Ctrl+Enter (Cmd+Enter on Mac)
  4. Load Examples - Use the dropdown to load pre-built example configurations

Configuration Structure

| Field | Description |
|---|---|
| metadata | Optional name/version info |
| variables | Pre-generate values to reuse |
| schema | Define the output structure |

The fake Operator

The fake operator is the core of datafake-rs. Use it to generate various types of fake data:

{"fake": ["type"]}
{"fake": ["type", arg1, arg2]}

Common Examples

| Type | Syntax | Description |
|---|---|---|
| UUID | {"fake": ["uuid"]} | UUID v4 identifier |
| Name | {"fake": ["name"]} | Full name |
| Email | {"fake": ["email"]} | Email address |
| Integer | {"fake": ["u8", 18, 65]} | Integer with range |
| Float | {"fake": ["f64", 0.0, 100.0]} | Float with range |
| Choice | {"fake": ["enum", "A", "B", "C"]} | Random selection |

Quick Reference

Personal Data

  • name, first_name, last_name, title
  • email, phone_number, username

Address

  • street_address, city, state_name, country_name
  • zip_code, latitude, longitude

Internet

  • ipv4, ipv6, mac_address, domain_name
  • user_agent, password

Finance

  • bic, bic8, bic11, iban
  • credit_card_number, currency_code

Content

  • word, words, sentence, paragraph
  • uuid, bool

See Fake Data Types for the complete list.

Installation

Rust

Add datafake-rs to your Cargo.toml:

[dependencies]
datafake-rs = "0.2"

Or use cargo add:

cargo add datafake-rs

WebAssembly (Browser/Node.js)

From npm

npm install datafake-wasm

From Source

Build the WASM package yourself:

# Install wasm-pack if not already installed
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh

# Clone and build
git clone https://github.com/GoPlasmatic/datafake-rs.git
cd datafake-rs/datafake-wasm
wasm-pack build --target web

Feature Flags

The library has minimal dependencies by default. No feature flags are required for basic usage.

Minimum Rust Version

datafake-rs requires Rust 2024 edition (Rust 1.85+).

Quick Start

Rust Usage

Basic Generation

use datafake_rs::DataGenerator;

fn main() -> datafake_rs::Result<()> {
    let config = r#"{
        "schema": {
            "id": {"fake": ["uuid"]},
            "name": {"fake": ["name"]},
            "email": {"fake": ["email"]}
        }
    }"#;

    let generator = DataGenerator::from_json(config)?;
    let result = generator.generate()?;

    println!("{}", serde_json::to_string_pretty(&result)?);
    Ok(())
}

Batch Generation

Generate multiple records efficiently:

use datafake_rs::DataGenerator;

fn main() -> datafake_rs::Result<()> {
    let config = r#"{
        "schema": {
            "id": {"fake": ["uuid"]},
            "name": {"fake": ["name"]}
        }
    }"#;

    let generator = DataGenerator::from_json(config)?;
    let results = generator.generate_batch(100)?;

    println!("Generated {} records", results.len());
    Ok(())
}

JavaScript/TypeScript Usage

Browser (ES Modules)

import init, { generate, FakeDataGenerator } from 'datafake-wasm';

async function main() {
    await init();

    // One-off generation
    const config = JSON.stringify({
        schema: {
            id: { fake: ["uuid"] },
            name: { fake: ["name"] },
            email: { fake: ["email"] }
        }
    });

    const result = JSON.parse(generate(config));
    console.log(result);
}

main();

Reusable Generator

For generating multiple records, use the FakeDataGenerator class:

import init, { FakeDataGenerator } from 'datafake-wasm';

async function main() {
    await init();

    const config = JSON.stringify({
        schema: {
            id: { fake: ["uuid"] },
            name: { fake: ["name"] }
        }
    });

    const gen = new FakeDataGenerator(config);

    // Generate single records
    const user1 = JSON.parse(gen.generate());
    const user2 = JSON.parse(gen.generate());

    // Generate batch
    const users = JSON.parse(gen.generate_batch(10));

    // Clean up when done
    gen.free();
}

main();

Node.js

const { generate, FakeDataGenerator } = require('datafake-wasm');

const config = JSON.stringify({
    schema: {
        id: { fake: ["uuid"] },
        name: { fake: ["name"] }
    }
});

const result = JSON.parse(generate(config));
console.log(result);

Basic Concepts

Configuration Structure

A datafake configuration is a JSON object with three sections; metadata and variables can be omitted:

{
    "metadata": { ... },
    "variables": { ... },
    "schema": { ... }
}

Metadata

Optional information about the configuration:

{
    "metadata": {
        "name": "User Generator",
        "version": "1.0.0",
        "description": "Generates fake user data"
    }
}

Variables

Pre-generate values that can be reused across the schema:

{
    "variables": {
        "userId": {"fake": ["uuid"]},
        "createdAt": {"fake": ["datetime"]}
    },
    "schema": {
        "id": {"var": "userId"},
        "audit": {
            "createdBy": {"var": "userId"},
            "createdAt": {"var": "createdAt"}
        }
    }
}

Schema

The schema defines the structure of the generated output. It uses JSONLogic operators, with the fake operator being the most important.

The fake Operator

The fake operator generates fake data of a specified type:

{"fake": ["type"]}
{"fake": ["type", arg1, arg2]}

Simple Types

{"fake": ["uuid"]}
{"fake": ["name"]}
{"fake": ["email"]}

Types with Arguments

Some types accept arguments for customization:

{"fake": ["u8", 18, 65]}     // Integer between 18 and 65
{"fake": ["password", 8, 16]} // Password with 8-16 characters
{"fake": ["words", 5]}        // 5 random words

JSONLogic Operators

datafake-rs supports all standard JSONLogic operators. Here are some commonly used ones:

Variable Access

Reference variables with var:

{"var": "userId"}
{"var": "user.name"}

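String Concatenation

Combine strings with cat, for example {"cat": ["Hello, ", {"fake": ["name"]}, "!"]}.

Conditional Logic

Use if for conditional generation; it takes the form {"if": [condition, then_value, else_value]}.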

Arrays

Generate arrays with map:

{
    "schema": {
        "tags": {"map": [
            [1, 2, 3],
            {"fake": ["word"]}
        ]}
    }
}

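Nested Objects

Create complex nested structures by nesting objects inside the schema. Each nested field can hold its own operator, as in the audit object of the Variables example above.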

Error Handling

Invalid configurations will result in clear error messages:

  • ConfigParse - Invalid JSON syntax
  • InvalidConfig - Missing required fields
  • FakeOperatorError - Unknown fake type or invalid arguments
  • VariableNotFound - Referenced variable doesn’t exist

Fake Data Types Overview

datafake-rs supports over 50 different fake data types organized into categories. Each type is invoked using the fake operator:

{"fake": ["type_name"]}
{"fake": ["type_name", arg1, arg2]}

Categories

| Category | Types | Description |
|---|---|---|
| Numeric | 10 | Integers and floating-point numbers with optional ranges |
| Personal | 7 | Names, titles, and personal information |
| Address | 12 | Street addresses, cities, countries, coordinates |
| Internet | 11 | Emails, usernames, IPs, domains |
| Company | 9 | Company names, industries, buzzwords |
| Finance | 10 | BIC, IBAN, credit cards, currencies |
| Content | 5 | Words, sentences, paragraphs, UUIDs |
| Date & Time | 5 | Dates, times, timestamps |
| Other | 8 | Booleans, files, barcodes, custom choices |

Quick Reference

All Types by Category

Numeric

u8, u16, u32, u64, i8, i16, i32, i64, f32, f64

Personal

name, full_name, first_name, last_name, name_with_title, title, suffix

Address

street_address, street_name, street_suffix, city, city_name, state_name, state_abbr, country_name, country_code, zip_code, post_code, latitude, longitude

Internet

email, safe_email, free_email, username, password, domain_name, domain_suffix, ipv4, ipv6, mac_address, user_agent

Company

company_name, company_suffix, industry, profession, catch_phrase, bs, bs_adj, bs_noun, bs_verb

Finance

bic, bic8, bic11, credit_card_number, currency_code, currency_name, currency_symbol, iban, lei, alphanumeric

Content

uuid, word, words, sentence, paragraph

Date & Time

datetime, iso8601_datetime, date, time, month_name

Other

bool, boolean, enum, pick, choice, regex, file_name, file_extension, file_path, dir_path, isbn10, isbn13

Numeric Types

Generate random numbers with optional range constraints.

Integers

Unsigned Integers

u8 (0 to 255)

{"fake": ["u8"]}
{"fake": ["u8", min, max]}

u16 (0 to 65,535)

{"fake": ["u16"]}
{"fake": ["u16", min, max]}

u32 (0 to 4,294,967,295)

{"fake": ["u32"]}
{"fake": ["u32", min, max]}

u64 (0 to 18,446,744,073,709,551,615)

{"fake": ["u64"]}
{"fake": ["u64", min, max]}

Signed Integers

i8 (-128 to 127)

{"fake": ["i8"]}
{"fake": ["i8", min, max]}

i16 (-32,768 to 32,767)

{"fake": ["i16"]}
{"fake": ["i16", min, max]}

i32 (-2,147,483,648 to 2,147,483,647)

{"fake": ["i32"]}
{"fake": ["i32", min, max]}

i64

{"fake": ["i64"]}
{"fake": ["i64", min, max]}

Floating Point

f32 (32-bit float)

{"fake": ["f32"]}
{"fake": ["f32", min, max]}

f64 (64-bit float)

{"fake": ["f64"]}
{"fake": ["f64", min, max]}

Usage Notes

  • When using ranges, both min and max are inclusive
  • If no range is specified, the full range of the type is used
  • Float types may produce very large or very small numbers without constraints
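
To make the inclusive-bounds note concrete, here is a minimal sketch in plain Rust. The helper names `inclusive_count` and `in_range` are hypothetical and are not part of datafake-rs; treating min > max as invalid is an assumption that loosely mirrors the InvalidRange error listed in the API reference.

```rust
/// Number of distinct values an inclusive range [min, max] can produce,
/// or None when min > max (treated here as an invalid range).
fn inclusive_count(min: u8, max: u8) -> Option<u16> {
    if min > max {
        return None;
    }
    // Widen before adding 1 so the full range 0..=255 (256 values) fits.
    Some(u16::from(max) - u16::from(min) + 1)
}

/// True when `value` lies within [min, max], with both bounds included.
fn in_range(value: u8, min: u8, max: u8) -> bool {
    (min..=max).contains(&value)
}
```

Under these assumptions, {"fake": ["u8", 18, 65]} has 48 possible outcomes, and both 18 and 65 are among them.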

Personal Data

Generate realistic personal information.

Names

name / full_name

Generates a complete name:

{"fake": ["name"]}
{"fake": ["full_name"]}

first_name

Generates a first name:

{"fake": ["first_name"]}

last_name

Generates a last name:

{"fake": ["last_name"]}

name_with_title

Generates a name with a professional title:

{"fake": ["name_with_title"]}

title

Generates a title/honorific (Mr., Mrs., Dr., etc.):

{"fake": ["title"]}

suffix

Generates a name suffix (Jr., Sr., III, etc.):

{"fake": ["suffix"]}

Phone Numbers

phone_number

Generates a phone number:

{"fake": ["phone_number"]}

cell_number

Generates a cell phone number:

{"fake": ["cell_number"]}

Address Data

Generate realistic address components and geographic data.

Street Address

street_address

Generates a full street address:

{"fake": ["street_address"]}

street_name

Generates just the street name:

{"fake": ["street_name"]}

street_suffix

Generates a street suffix (Street, Avenue, Lane, etc.):

{"fake": ["street_suffix"]}

City and State

city / city_name

Generates a city name:

{"fake": ["city"]}
{"fake": ["city_name"]}

state_name

Generates a US state name:

{"fake": ["state_name"]}

state_abbr

Generates a US state abbreviation:

{"fake": ["state_abbr"]}

Country

country_name

Generates a country name:

{"fake": ["country_name"]}

country_code

Generates a country code:

{"fake": ["country_code"]}

Postal Codes

zip_code / zip

Generates a US ZIP code:

{"fake": ["zip_code"]}
{"fake": ["zip"]}

post_code / postcode / postal_code

Generates a postal code:

{"fake": ["post_code"]}
{"fake": ["postcode"]}
{"fake": ["postal_code"]}

Geographic Coordinates

latitude

Generates a latitude value (-90 to 90):

{"fake": ["latitude"]}

longitude

Generates a longitude value (-180 to 180):

{"fake": ["longitude"]}

Internet Data

Generate internet-related fake data like emails, usernames, IPs, and more.

Email Addresses

email / safe_email

Generates a safe email address (uses example.com domains):

{"fake": ["email"]}
{"fake": ["safe_email"]}

free_email

Generates an email from free email providers:

{"fake": ["free_email"]}

Usernames and Passwords

username

Generates a username:

{"fake": ["username"]}

password

Generates a password with configurable length:

{"fake": ["password"]}
{"fake": ["password", min_length, max_length]}

Domains

domain_name

Generates a domain name:

{"fake": ["domain_name"]}

domain_suffix

Generates a domain suffix (com, org, net, etc.):

{"fake": ["domain_suffix"]}

IP Addresses

ipv4

Generates an IPv4 address:

{"fake": ["ipv4"]}

ipv6

Generates an IPv6 address:

{"fake": ["ipv6"]}

Network

mac_address

Generates a MAC address:

{"fake": ["mac_address"]}

user_agent

Generates a browser user agent string:

{"fake": ["user_agent"]}

Company Data

Generate business-related fake data.

Company Names

company_name

Generates a company name:

{"fake": ["company_name"]}

company_suffix

Generates a company suffix (Inc., LLC, Corp., etc.):

{"fake": ["company_suffix"]}

Industry and Profession

industry

Generates an industry name:

{"fake": ["industry"]}

profession

Generates a profession/job title:

{"fake": ["profession"]}

Marketing Buzzwords

catch_phrase

Generates a company catch phrase:

{"fake": ["catch_phrase"]}

bs (Business Speak)

Generates business jargon:

{"fake": ["bs"]}

bs_adj

Generates a business adjective:

{"fake": ["bs_adj"]}

bs_noun

Generates a business noun:

{"fake": ["bs_noun"]}

bs_verb

Generates a business verb:

{"fake": ["bs_verb"]}

Finance Data

Generate financial data for testing payment systems and banking applications.

Bank Identifiers

bic

Generates a BIC (Bank Identifier Code), randomly 8 or 11 characters:

{"fake": ["bic"]}
{"fake": ["bic", 8]}   // Force 8 characters
{"fake": ["bic", 11]}  // Force 11 characters

bic8

Generates an 8-character BIC:

{"fake": ["bic8"]}

bic11

Generates an 11-character BIC (with branch code):

{"fake": ["bic11"]}

iban

Generates an IBAN with optional country code:

{"fake": ["iban"]}
{"fake": ["iban", "DE"]}  // German IBAN
{"fake": ["iban", "FR"]}  // French IBAN

lei

Generates a Legal Entity Identifier:

{"fake": ["lei"]}

Credit Cards

credit_card_number

Generates a credit card number:

{"fake": ["credit_card_number"]}

Currency

currency_code

Generates a currency code (USD, EUR, etc.):

{"fake": ["currency_code"]}

currency_name

Generates a currency name:

{"fake": ["currency_name"]}

currency_symbol

Generates a currency symbol:

{"fake": ["currency_symbol"]}

Alphanumeric

alphanumeric

Generates an alphanumeric string (useful for reference numbers):

{"fake": ["alphanumeric", length]}
{"fake": ["alphanumeric", min_length, max_length]}

Content Data

Generate text content, identifiers, and lorem ipsum text.

Identifiers

uuid

Generates a UUID v4:

{"fake": ["uuid"]}

Lorem Ipsum

word

Generates a single random word:

{"fake": ["word"]}

words

Generates multiple words (default 5):

{"fake": ["words"]}
{"fake": ["words", count]}

sentence

Generates a sentence with configurable word count:

{"fake": ["sentence"]}
{"fake": ["sentence", min_words, max_words]}

paragraph

Generates a paragraph with configurable sentence count:

{"fake": ["paragraph"]}
{"fake": ["paragraph", min_sentences, max_sentences]}

Date & Time Data

Generate dates, times, and timestamps.

Timestamps

datetime / iso8601_datetime

Generates an ISO 8601 formatted datetime:

{"fake": ["datetime"]}
{"fake": ["iso8601_datetime"]}

Date

date

Generates a date with optional format:

{"fake": ["date"]}
{"fake": ["date", format]}

Default format is %Y-%m-%d.

Common Date Formats

| Format | Example |
|---|---|
| %Y-%m-%d | 2024-01-15 |
| %m/%d/%Y | 01/15/2024 |
| %d.%m.%Y | 15.01.2024 |
| %B %d, %Y | January 15, 2024 |
| %Y%m%d | 20240115 |
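
The specifiers in the table follow the familiar strftime convention. As a rough illustration of how each one expands, the sketch below handles only %Y, %m, %d, and %B for a fixed date. The function `format_date` is hypothetical, uses only the standard library, and is not how datafake-rs formats dates.

```rust
/// Expands a small subset of strftime-style specifiers (%Y, %m, %d, %B)
/// for a given calendar date. Unknown specifiers are copied through unchanged.
fn format_date(fmt: &str, year: i32, month: u32, day: u32) -> String {
    const MONTHS: [&str; 12] = [
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December",
    ];
    assert!((1..=12).contains(&month), "month must be 1-12");

    let mut out = String::new();
    let mut chars = fmt.chars();
    while let Some(c) = chars.next() {
        if c != '%' {
            out.push(c);
            continue;
        }
        // A '%' introduces a specifier; look at the character after it.
        match chars.next() {
            Some('Y') => out.push_str(&format!("{year:04}")),
            Some('m') => out.push_str(&format!("{month:02}")),
            Some('d') => out.push_str(&format!("{day:02}")),
            Some('B') => out.push_str(MONTHS[month as usize - 1]),
            Some(other) => {
                out.push('%');
                out.push(other);
            }
            None => out.push('%'),
        }
    }
    out
}
```

Called with the date 15 January 2024, each format string in the table produces the example shown next to it.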

Time

time

Generates a time in HH:MM:SS format:

{"fake": ["time"]}

Month

month_name

Generates a random month name:

{"fake": ["month_name"]}

Other Data Types

Additional fake data types including booleans, files, barcodes, and custom choices.

Boolean

bool / boolean

Generates a random boolean:

{"fake": ["bool"]}
{"fake": ["boolean"]}

Custom Choices

enum / pick / choice

Pick a random value from provided options:

{"fake": ["enum", "option1", "option2", "option3"]}
{"fake": ["pick", "option1", "option2", "option3"]}
{"fake": ["choice", "option1", "option2", "option3"]}

regex

Generate from simple alternation patterns:

{"fake": ["regex", "(A|B|C)"]}

Note: Only supports simple alternation patterns like (A|B|C). Complex regex patterns are not supported.
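
To illustrate what a "simple alternation pattern" means here, the following sketch splits such a pattern into its options and rejects anything more elaborate. `parse_alternation` is a hypothetical helper written for this page; the library's own parser may accept or reject different inputs.

```rust
/// Splits a pattern of the form "(A|B|C)" into its alternatives.
/// Returns None for anything else: missing parentheses, nested groups,
/// other regex metacharacters, or an empty alternative.
fn parse_alternation(pattern: &str) -> Option<Vec<&str>> {
    let inner = pattern.strip_prefix('(')?.strip_suffix(')')?;
    // Reject constructs this toy parser does not model.
    if inner.contains(|c: char| "()[]{}*+?\\.^$".contains(c)) {
        return None;
    }
    let options: Vec<&str> = inner.split('|').collect();
    if options.iter().any(|o| o.is_empty()) {
        return None;
    }
    Some(options)
}
```

A generator built on this would then pick one of the returned options at random, much like the enum, pick, and choice types.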

File System

file_name

Generates a file name with extension:

{"fake": ["file_name"]}

file_extension

Generates a file extension:

{"fake": ["file_extension"]}

file_path

Generates a file path:

{"fake": ["file_path"]}

dir_path

Generates a directory path:

{"fake": ["dir_path"]}

Barcodes

isbn10

Generates an ISBN-10:

{"fake": ["isbn10"]}

isbn13

Generates an ISBN-13:

{"fake": ["isbn13"]}

Variable System

Variables allow you to pre-generate values and reuse them across your schema. This is useful when you need the same value in multiple places or want to create relationships between fields.

Defining Variables

Variables are defined in the variables section of your configuration:

{
    "variables": {
        "userId": {"fake": ["uuid"]},
        "createdAt": {"fake": ["datetime"]}
    },
    "schema": {
        ...
    }
}

Each variable is evaluated once at the start of each record's generation, and the result is cached for reuse within that record.

Referencing Variables

Use the var operator to reference a variable:

{"var": "variableName"}

Basic Example

Suppose the variables section defines userId and timestamp. If the schema references userId from both id and audit.createdBy, the two fields receive the same value, and a single timestamp can likewise feed both createdAt and updatedAt.

Variable Scope

Variables are evaluated in the order they are defined, and each variable can only reference variables defined before it.

{
    "variables": {
        "firstName": {"fake": ["first_name"]},
        "lastName": {"fake": ["last_name"]},
        "fullName": {"cat": [{"var": "firstName"}, " ", {"var": "lastName"}]}
    },
    "schema": {
        "name": {"var": "fullName"},
        "firstName": {"var": "firstName"},
        "lastName": {"var": "lastName"}
    }
}
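
The ordering rule can be pictured with a toy model. The sketch below is not the library's implementation: `Part` and `resolve_in_order` are hypothetical names, fixed strings stand in for fake values, and concatenation loosely imitates cat. It only shows why a reference to a later variable fails.

```rust
use std::collections::HashMap;

/// A toy expression part: literal text or a reference to another variable.
enum Part {
    Text(&'static str),
    Var(&'static str),
}

/// Resolves variables in definition order. Each variable is a list of parts
/// that are concatenated into one string.
fn resolve_in_order(
    vars: &[(&'static str, Vec<Part>)],
) -> Result<HashMap<&'static str, String>, String> {
    let mut resolved: HashMap<&'static str, String> = HashMap::new();
    for (name, parts) in vars {
        let mut value = String::new();
        for part in parts {
            match part {
                Part::Text(t) => value.push_str(t),
                // Only variables resolved earlier in the list are visible here.
                Part::Var(v) => match resolved.get(v) {
                    Some(s) => value.push_str(s),
                    None => return Err(format!("variable not found: {v}")),
                },
            }
        }
        resolved.insert(*name, value);
    }
    Ok(resolved)
}
```

With firstName and lastName listed before fullName, resolution succeeds; moving fullName to the front makes it fail, because firstName has not been resolved yet.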

Variables vs Schema Values

| Feature | Variables | Schema Values |
|---|---|---|
| Evaluated | Once per record, before the schema | Each time the field is generated |
| Reusable | Yes, via var | No |
| Order-dependent | Yes | No |
| Best for | Shared values, relationships | Unique values per field |

Tips

  1. Use variables for shared IDs - When multiple fields need the same identifier
  2. Use variables for timestamps - When created/updated times should match
  3. Use variables for names - When you need to derive email from name
  4. Keep variables simple - Complex expressions are better in the schema

JSONLogic Integration

datafake-rs is built on top of datalogic-rs, a high-performance JSONLogic implementation. This means you can use all standard JSONLogic operators alongside the fake operator.

Core Operators

Variable Access

Use var to access variables or nested data:

{"var": "variableName"}
{"var": "nested.path.to.value"}

String Concatenation

Use cat to combine strings:

{"cat": ["Hello, ", {"fake": ["name"]}, "!"]}

Conditional Logic

Use if for conditional generation:

{"if": [condition, then_value, else_value]}

Comparison Operators

| Operator | Description | Example |
|---|---|---|
| == | Equal | {"==": [1, 1]} |
| != | Not equal | {"!=": [1, 2]} |
| > | Greater than | {">": [5, 3]} |
| >= | Greater or equal | {">=": [5, 5]} |
| < | Less than | {"<": [3, 5]} |
| <= | Less or equal | {"<=": [5, 5]} |

Logical Operators

and

All conditions must be true:

{"and": [condition1, condition2, ...]}

or

At least one condition must be true:

{"or": [condition1, condition2, ...]}

not / !

Negates a condition:

{"!": [condition]}
{"not": [condition]}

Arithmetic Operators

| Operator | Description | Example |
|---|---|---|
| + | Add | {"+": [1, 2, 3]} → 6 |
| - | Subtract | {"-": [10, 3]} → 7 |
| * | Multiply | {"*": [2, 3, 4]} → 24 |
| / | Divide | {"/": [10, 2]} → 5 |
| % | Modulo | {"%": [10, 3]} → 1 |

Array Operators

map

Transform each element:

{"map": [array, transformation]}

filter

Filter elements by condition:

{"filter": [array, condition]}

reduce

Reduce array to single value:

{"reduce": [array, reducer, initial_value]}

merge

Combine arrays:

{"merge": [array1, array2]}

String Operators

cat

Concatenate strings:

{"cat": ["string1", "string2", ...]}

substr

Extract substring:

{"substr": ["string", start, length]}

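Combining with fake

You can combine any JSONLogic operator with fake. The cat example above, {"cat": ["Hello, ", {"fake": ["name"]}, "!"]}, embeds a generated name in a fixed greeting.

Learn More

For complete JSONLogic documentation, see the documentation for datalogic-rs, the JSONLogic implementation that datafake-rs is built on.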

Batch Generation

When you need to generate multiple records, batch generation is more efficient than calling generate multiple times.

Rust API

Basic Batch Generation

use datafake_rs::DataGenerator;

fn main() -> datafake_rs::Result<()> {
    let config = r#"{
        "schema": {
            "id": {"fake": ["uuid"]},
            "name": {"fake": ["name"]},
            "email": {"fake": ["email"]}
        }
    }"#;

    let generator = DataGenerator::from_json(config)?;

    // Generate 100 records
    let users = generator.generate_batch(100)?;

    println!("Generated {} users", users.len());
    for user in &users {
        println!("{}", user);
    }

    Ok(())
}

Reusing the Generator

The DataGenerator can be reused for multiple batch generations:

let generator = DataGenerator::from_json(config)?;

// Generate different batches
let batch1 = generator.generate_batch(10)?;
let batch2 = generator.generate_batch(20)?;
let batch3 = generator.generate_batch(50)?;

JavaScript/TypeScript API

Using FakeDataGenerator

import init, { FakeDataGenerator } from 'datafake-wasm';

async function main() {
    await init();

    const config = JSON.stringify({
        schema: {
            id: { fake: ["uuid"] },
            name: { fake: ["name"] },
            email: { fake: ["email"] }
        }
    });

    const gen = new FakeDataGenerator(config);

    // Generate 100 records
    const users = JSON.parse(gen.generate_batch(100));

    console.log(`Generated ${users.length} users`);

    // Clean up when done
    gen.free();
}

main();

Memory Management

When using FakeDataGenerator in JavaScript, always call free() when you’re done to release WASM memory:

const gen = new FakeDataGenerator(config);
try {
    const batch = gen.generate_batch(1000);
    // Process batch...
} finally {
    gen.free();
}

Try It

In the playground, use the Count field to generate multiple records.

Performance Tips

1. Reuse the Generator

Creating a generator parses and validates the configuration. Reuse the same generator for multiple batches:

// Good - parse once, generate many times
let generator = DataGenerator::from_json(config)?;
for _ in 0..100 {
    let batch = generator.generate_batch(1000)?;
}

// Bad - parsing on every iteration
for _ in 0..100 {
    let generator = DataGenerator::from_json(config)?;
    let batch = generator.generate_batch(1000)?;
}

2. Use Appropriate Batch Sizes

Larger batches are more efficient due to reduced function call overhead:

// Better - one batch of 10,000
let batch = generator.generate_batch(10000)?;

// Worse - 100 batches of 100
for _ in 0..100 {
    let batch = generator.generate_batch(100)?;
}

3. Minimize Complex Expressions

Simple schemas generate faster than complex nested expressions:

// Fast
{"schema": {"id": {"fake": ["uuid"]}, "name": {"fake": ["name"]}}}

// Slower (many nested operations)
{"schema": {"data": {"if": [{"==": [{"fake": ["bool"]}, true]}, ...]}}}

Streaming Large Batches

For very large datasets, consider generating in chunks to manage memory:

let generator = DataGenerator::from_json(config)?;

// Generate 1 million records in chunks
let chunk_size = 10000;
let total_records = 1_000_000;

// Integer division: a total that is not a multiple of chunk_size loses its remainder.
for _ in 0..(total_records / chunk_size) {
    let batch = generator.generate_batch(chunk_size)?;
    // Process or write batch to file/database (process_batch is a placeholder)
    process_batch(&batch)?;
}
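
The loop above divides the total by the chunk size, which works when the total is an exact multiple. As a small sketch of one way to cover the general case, the hypothetical helper `chunk_sizes` below computes the size of every chunk, including a final partial one. It uses only the standard library and is not part of datafake-rs.

```rust
/// Splits `total` records into chunk sizes of at most `chunk_size`,
/// keeping the final partial chunk that plain integer division would drop.
fn chunk_sizes(total: usize, chunk_size: usize) -> Vec<usize> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut sizes = Vec::new();
    let mut remaining = total;
    while remaining > 0 {
        let n = remaining.min(chunk_size);
        sizes.push(n);
        remaining -= n;
    }
    sizes
}
```

Each returned size could then be passed to generate_batch in turn, so that 25 records with a chunk size of 10 are produced as batches of 10, 10, and 5.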

Output Format

Batch generation returns a JSON array:

[
    {"id": "...", "name": "Alice Smith", "email": "alice@example.com"},
    {"id": "...", "name": "Bob Jones", "email": "bob@example.com"},
    {"id": "...", "name": "Carol Brown", "email": "carol@example.com"}
]

Each record in the batch is generated independently, so records do not share fake values (random collisions aside). Variables are regenerated for each record as well: a variable's value is shared across the fields of one record, not across the batch.

API Reference

Rust API

DataGenerator

The main entry point for generating fake data.

use datafake_rs::DataGenerator;

Constructors

from_json

Create a generator from a JSON string:

pub fn from_json(json_str: &str) -> Result<Self>

Example:

let config = r#"{"schema": {"id": {"fake": ["uuid"]}}}"#;
let generator = DataGenerator::from_json(config)?;

from_value

Create a generator from a serde_json::Value:

pub fn from_value(json_value: Value) -> Result<Self>

Example:

use serde_json::json;

let config = json!({
    "schema": {
        "id": {"fake": ["uuid"]}
    }
});
let generator = DataGenerator::from_value(config)?;

new

Create a generator from a DataFakeConfig:

pub fn new(config: DataFakeConfig) -> Self

Methods

generate

Generate a single record:

pub fn generate(&self) -> Result<Value>

Example:

let result = generator.generate()?;
println!("{}", serde_json::to_string_pretty(&result)?);

generate_batch

Generate multiple records:

pub fn generate_batch(&self, count: usize) -> Result<Vec<Value>>

Example:

let results = generator.generate_batch(100)?;
for result in results {
    println!("{}", result);
}

DataFakeConfig

Configuration structure for the generator.

pub struct DataFakeConfig {
    pub metadata: Option<Metadata>,
    pub variables: HashMap<String, Value>,
    pub schema: Value,
}

Metadata

Optional metadata for the configuration.

pub struct Metadata {
    pub name: Option<String>,
    pub version: Option<String>,
    pub description: Option<String>,
    pub extra: HashMap<String, Value>,
}

Error Types

pub enum DataFakeError {
    ConfigParse(String),
    InvalidConfig(String),
    VariableNotFound(String),
    JsonError(serde_json::Error),
    FakeOperatorError(String),
    TypeConversion(String),
    InvalidLocale(String),
    InvalidRange(String),
}

| Error | Description |
|---|---|
| ConfigParse | Failed to parse JSON configuration |
| InvalidConfig | Configuration is missing required fields |
| VariableNotFound | Referenced variable doesn’t exist |
| JsonError | JSON serialization/deserialization error |
| FakeOperatorError | Invalid fake operator or arguments |
| TypeConversion | Type conversion failed |
| InvalidLocale | Invalid locale specified |
| InvalidRange | Invalid numeric range |
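
Callers will usually want to branch on the error variant. The sketch below shows an exhaustive match over a local stand-in enum, so that it compiles with the standard library alone. `ErrorSketch` and `describe` are hypothetical names that mirror the variants listed above; the JsonError variant is left out because it wraps serde_json::Error. In real code the same match would be written against datafake_rs::DataFakeError.

```rust
/// Local stand-in for the documented variants (hypothetical mirror for
/// illustration; JsonError is omitted to stay within the standard library).
#[allow(dead_code)]
#[derive(Debug)]
enum ErrorSketch {
    ConfigParse(String),
    InvalidConfig(String),
    VariableNotFound(String),
    FakeOperatorError(String),
    TypeConversion(String),
    InvalidLocale(String),
    InvalidRange(String),
}

/// Maps each variant to a short description followed by its message.
/// The match is exhaustive, so adding a variant forces this function to be updated.
fn describe(err: &ErrorSketch) -> String {
    let (what, msg) = match err {
        ErrorSketch::ConfigParse(m) => ("Failed to parse JSON configuration", m),
        ErrorSketch::InvalidConfig(m) => ("Configuration is missing required fields", m),
        ErrorSketch::VariableNotFound(m) => ("Referenced variable does not exist", m),
        ErrorSketch::FakeOperatorError(m) => ("Invalid fake operator or arguments", m),
        ErrorSketch::TypeConversion(m) => ("Type conversion failed", m),
        ErrorSketch::InvalidLocale(m) => ("Invalid locale specified", m),
        ErrorSketch::InvalidRange(m) => ("Invalid numeric range", m),
    };
    format!("{what}: {msg}")
}
```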

WebAssembly API

generate

One-off generation from a configuration string:

function generate(config: string): string

Parameters:

  • config - JSON string containing the datafake configuration

Returns:

  • JSON string of the generated data

Throws:

  • Error if configuration is invalid

Example:

import init, { generate } from 'datafake-wasm';

await init();

const config = JSON.stringify({
    schema: { id: { fake: ["uuid"] } }
});

const result = JSON.parse(generate(config));

FakeDataGenerator

Reusable generator class for multiple generations.

Constructor

new FakeDataGenerator(config: string)

Parameters:

  • config - JSON string containing the datafake configuration

Throws:

  • Error if configuration is invalid

Methods

generate

Generate a single record:

generate(): string

Returns:

  • JSON string of the generated data

generate_batch

Generate multiple records:

generate_batch(count: number): string

Parameters:

  • count - Number of records to generate

Returns:

  • JSON string of an array containing the generated records

free

Release WASM memory:

free(): void

Important: Always call free() when done to prevent memory leaks.

Example:

import init, { FakeDataGenerator } from 'datafake-wasm';

await init();

const config = JSON.stringify({
    schema: {
        id: { fake: ["uuid"] },
        name: { fake: ["name"] }
    }
});

const gen = new FakeDataGenerator(config);

try {
    const single = JSON.parse(gen.generate());
    const batch = JSON.parse(gen.generate_batch(10));
} finally {
    gen.free();
}

Configuration Schema

Full Configuration

{
    "metadata": {
        "name": "string (optional)",
        "version": "string (optional)",
        "description": "string (optional)",
        "...": "any additional fields"
    },
    "variables": {
        "variableName": "<JSONLogic expression>"
    },
    "schema": "<JSONLogic expression>"
}

fake Operator

{"fake": ["type"]}
{"fake": ["type", arg1, arg2, ...]}

See Fake Data Types for all available types.

var Operator

Reference a variable:

{"var": "variableName"}
{"var": "nested.path"}

Complete Example

use datafake_rs::{DataGenerator, Result};

fn main() -> Result<()> {
    let config = r#"{
        "metadata": {
            "name": "User Generator",
            "version": "1.0.0"
        },
        "variables": {
            "userId": {"fake": ["uuid"]},
            "createdAt": {"fake": ["datetime"]}
        },
        "schema": {
            "id": {"var": "userId"},
            "profile": {
                "name": {"fake": ["name"]},
                "email": {"fake": ["email"]},
                "age": {"fake": ["u8", 18, 65]}
            },
            "metadata": {
                "createdAt": {"var": "createdAt"},
                "createdBy": {"var": "userId"}
            }
        }
    }"#;

    let generator = DataGenerator::from_json(config)?;

    // Generate single record
    let user = generator.generate()?;
    println!("Single user: {}", serde_json::to_string_pretty(&user)?);

    // Generate batch
    let users = generator.generate_batch(10)?;
    println!("Generated {} users", users.len());

    Ok(())
}