Human Vocality Primitives
A comprehensive collection of natural inhale–exhale cycles, emotional breaths, and physiological airflow patterns captured with clinical clarity for expressive audio modeling.
Structured at the articulation level using documented production workflows and secured under the Proteus Standard™.
Breathing cycles and physiological breathing patterns are airflow-driven respiratory behaviors produced without vocal fold engagement, relying instead on coordinated inhalation and exhalation, breath depth, timing, and natural pauses to generate sound. In these behaviors, acoustic energy is created through cyclical airflow rather than phonation or articulation, resulting in audible breath movement with no pitch, voicing, or harmonic excitation. Expressive control emerges through modulation of breathing rate, depth, regularity, and transitions between inhale, exhale, holds, and silence rather than changes in loudness or pitch.
Acoustically, breathing cycles and physiological patterns occupy a broadband, noise-dominant timbral space characterized by diffuse spectral energy, cycle-level temporal structure, and natural variability across repeated breaths. Expressivity is conveyed through breathing rhythm, inhale–exhale balance, pause placement, and subtle irregularities rather than through vocal tract shaping or articulatory gesture. These characteristics make breathing cycles and physiological patterns particularly well suited for modeling natural respiratory behavior, embodied and breath-aware systems, and airflow-centric acoustic analysis, where precise control of breathing cadence, cycle transitions, and repeatable physiological patterns is essential.
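One common way to quantify the "broadband, noise-dominant" character described above is spectral flatness: the ratio of the geometric mean to the arithmetic mean of the power spectrum. The sketch below is illustrative only and is not part of the dataset tooling. It uses a naive DFT on short synthetic signals, and the function names and the `eps` floor are assumptions introduced for the example.

```python
import cmath
import math
import random


def power_spectrum(samples):
    """Naive DFT power for bins 1..N/2 (DC is skipped). Fine for short frames."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(samples, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum.

    Values near 1 indicate noise-like, diffuse energy; values near 0
    indicate energy concentrated in a few bins (tonal content).
    `eps` is a small floor that avoids log(0) on empty bins.
    """
    power = [p + eps for p in power_spectrum(samples)]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


# Synthetic stand-ins: Gaussian noise (breath-like) versus a pure tone.
rng = random.Random(0)
noise_frame = [rng.gauss(0.0, 1.0) for _ in range(256)]
tone_frame = [math.sin(2 * math.pi * 16 * t / 256) for t in range(256)]

print(spectral_flatness(noise_frame) > spectral_flatness(tone_frame))
```

On real recordings a windowed FFT from a numerical library would normally replace the naive DFT. The point here is only that unvoiced breath noise should score far higher than pitched material.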
Key technical details for this dataset — including file counts, duration, delivery format, and session context.
Planned technical specifications and recording standards for this dataset.
Total Files: 31
Total Files (Preview):
Total Duration (Hours):
Sample Rate (Hz): 96000
Bit Depth (Delivery): 24
Dataset Version: v1.0
Recording Environment: Treated Studio
Microphone Configuration: Rode NT1-A positioned 3–4 inches from mouth
Performer: Blake Pullen
Recording Dates: Jan. 30th, 2026
Recording Location: Las Vegas, NV
Produced using standardized capture, editing, and QC protocols with versioned metadata and Proteus-backed provenance.
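The listed delivery specification (96000 Hz, 24-bit) can be checked against a file header on receipt. The following is a minimal sketch that assumes files arrive in a WAV container, which the page does not state explicitly. `check_wav_spec` and `make_silent_wav` are hypothetical helper names, and the silent in-memory file exists only to demonstrate the check.

```python
import io
import wave

EXPECTED_RATE_HZ = 96000    # "Sample Rate (Hz)" from the specification list
EXPECTED_SAMPLE_WIDTH = 3   # 24-bit delivery = 3 bytes per sample


def check_wav_spec(source):
    """Return a list of mismatches between a WAV header and the listed spec.

    `source` may be a file path or a binary file-like object.
    An empty list means the header matches.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width {wav.getsampwidth() * 8} bit")
    return problems


def make_silent_wav(rate_hz, sample_width, frames=960):
    """Build a tiny silent mono WAV in memory, used only to demo the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * (frames * sample_width))
    buffer.seek(0)
    return buffer


print(check_wav_spec(make_silent_wav(96000, 3)))   # matches the spec
print(check_wav_spec(make_silent_wav(44100, 2)))   # reports both mismatches
```

Python's `wave` module reads plain PCM headers. Files written with an extensible WAV header may need a recent Python version or a different reader.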
An overview of what’s included in this dataset — from articulations and performance styles to session context and recording notes.
This dataset contains a comprehensive collection of recordings capturing breathing cycles and physiological respiratory patterns, recorded in isolation to emphasize natural airflow-driven breathing behavior. The material is designed to document the structure, timing, and capture quality of breathing cycles as foundational respiratory primitives, rather than to convey voiced sound, semantic content, or linguistic meaning.
Included recordings emphasize neutral breathing at relaxed and deeper depths, slow and extended breathing cycles, faster and shallow breathing patterns, brief pauses and holds at inhale or exhale boundaries, and irregular or transitional breathing events. All material is presented in a neutral, non-performative context to support airflow modeling, embodied systems research, and airflow-centric audio analysis workflows. Together, the dataset provides a structured, repeatable corpus suitable for studying and modeling natural respiratory behavior across a range of breathing rates, depths, and physiological timing patterns.
All audio was recorded in a controlled studio environment using standardized capture, editing, and QC protocols applied consistently across the Harmonic Frontier Audio catalog.
Source material was captured at 32-bit float to preserve full dynamic headroom and minimize quantization artifacts during editing and processing. Final dataset files are delivered as 24-bit PCM for consistency and downstream compatibility.
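To illustrate what the 32-bit float to 24-bit PCM step involves, here is a minimal sketch of the sample conversion. It is not the production pipeline: the scaling convention (full scale mapped to 2^23 - 1), the hard clipping, and the absence of dither are all assumptions made for the example, and the function names are hypothetical.

```python
MAX_INT24 = 2 ** 23 - 1  # 8388607, the largest signed 24-bit value


def float_to_pcm24(samples):
    """Convert floats (nominally in [-1.0, 1.0]) to little-endian 24-bit PCM bytes."""
    out = bytearray()
    for x in samples:
        # Float audio can exceed full scale; fixed-point PCM cannot, so clip.
        x = max(-1.0, min(1.0, x))
        value = int(round(x * MAX_INT24))
        out += value.to_bytes(3, byteorder="little", signed=True)
    return bytes(out)


def pcm24_to_float(data):
    """Decode little-endian signed 24-bit PCM bytes back to floats."""
    return [
        int.from_bytes(data[i:i + 3], byteorder="little", signed=True) / MAX_INT24
        for i in range(0, len(data), 3)
    ]


encoded = float_to_pcm24([0.0, 0.5, -0.25, 1.5])   # 1.5 is clipped to 1.0
decoded = pcm24_to_float(encoded)
print(len(encoded), decoded[-1])
```

The round-trip error per sample stays below one 24-bit quantization step, which is why float capture leaves headroom during editing while 24-bit delivery remains practical for downstream tools.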
A single performer and respiratory source were used consistently across all sessions to maintain physiological continuity, breathing pattern stability, and repeatable respiratory behavior.
Vocal source details:
Human respiratory system — non-phonated breathing cycles and physiological patterns
Additional processing was limited to trimming, fade handling, and integrity checks. No creative processing, normalization, compression, or dynamic shaping was applied beyond what was necessary for clean, faithful delivery of the recorded material.
A structured breakdown of the expressive building blocks in this dataset — including articulations, dynamics, transitions, and any extended techniques captured during recording.
Unlike clip- or phrase-based datasets, this dataset is structured at the articulation and gesture level. This enables interpretable control, expressive variability, and human-aligned modeling, but it substantially increases production complexity and limits who can produce such datasets correctly at scale.
This dataset includes a structured set of breathing cycle and physiological respiratory articulations, recorded in isolation to support respiration-aware modeling and analysis of natural breathing behavior.
Articulations include:
• Neutral breathing cycles at relaxed and slightly deeper depths
• Slow and extended breathing cycles with lengthened inhale and exhale phases
• Faster and shallower breathing cycles with increased rate and reduced depth
• Breathing patterns incorporating brief inhale or exhale holds
• Irregular and transitional breathing events reflecting natural physiological variability
Articulations are recorded without voicing, linguistic content, accompaniment, or rhythmic framing to preserve clarity, separability, and modeling utility across airflow modeling, embodied systems research, and airflow-centric analysis contexts.
This dataset includes a focused set of gesture-level respiratory behaviors capturing natural breathing cycles and physiological breathing patterns, with an emphasis on breath timing, depth modulation, cycle pacing, and transitions between inhale, exhale, holds, and silence.
Gesture types include:
• Micro-variation in breathing rate and depth across repeated inhale–exhale cycles
• Gradual extension and compression of inhale and exhale phases during slow or faster breathing patterns
• Subtle shifts in breathing regularity reflecting natural physiological variability
• Controlled pauses and holds at inhale or exhale boundaries
• Near-threshold respiratory behaviors where breath movement, room tone, and silence converge
Gestures are recorded in isolation and without linguistic framing to preserve clarity, repeatability, and modeling utility, supporting detailed analysis of natural respiratory behavior and breath-driven sound generation across airflow modeling and embodied systems research contexts.
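The timing properties named above (breathing rate, inhale and exhale balance, holds, and silence) can be summarized from phase durations. The sketch below assumes a hypothetical annotation format, a list of `(label, duration_seconds)` pairs; the page does not say whether the dataset ships with phase-level annotations, so treat the format and the function name as illustrative.

```python
PHASES = ("inhale", "exhale", "hold", "silence")


def summarize_cycles(phases):
    """Summarize breathing timing from (label, duration_seconds) pairs.

    One cycle is counted per inhale phase. Returns total time per phase,
    breaths per minute, and the inhale-to-exhale duration ratio.
    """
    totals = {name: 0.0 for name in PHASES}
    cycles = 0
    for label, duration in phases:
        if label not in totals:
            raise ValueError(f"unknown phase label: {label!r}")
        totals[label] += duration
        if label == "inhale":
            cycles += 1

    total_time = sum(totals.values())
    rate = 60.0 * cycles / total_time if total_time > 0 else 0.0
    ratio = totals["inhale"] / totals["exhale"] if totals["exhale"] > 0 else None
    return {"totals": totals, "breaths_per_minute": rate, "inhale_exhale_ratio": ratio}


# Two slow cycles with a one-second hold after each inhale (made-up values).
example = [("inhale", 2.0), ("hold", 1.0), ("exhale", 3.0)] * 2
summary = summarize_cycles(example)
print(summary["breaths_per_minute"], summary["inhale_exhale_ratio"])
```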
This dataset was recorded, documented, and released under The Proteus Standard™, Harmonic Frontier Audio’s framework for rights-cleared, provenance-audited audio data.
The Proteus Standard ensures:
• Performer-owned, contract-clean source material
• Transparent recording methodology and metadata
• Consistent capture, QC, and documentation practices across the catalog
Captured with expert musicians and vocalists across global traditions — ensuring each dataset carries authentic nuance, human expression, and rights-managed provenance.

Blake Pullen is a multidisciplinary vocalist, musician, and recording engineer with a background spanning formal vocal performance, traditional acoustic music, and high-fidelity audio production.
With formal training in vocal performance and extensive experience recording both instruments and extended vocal techniques, Blake approaches dataset creation from a physiological, acoustic, and systems-oriented perspective. His work emphasizes articulatory precision, repeatability across sessions, and capture practices designed to support long-term machine learning, speech research, and expressive voice modeling rather than performative presentation.
As the founder of Harmonic Frontier Audio, he performs and records the initial datasets to establish a consistent technical and methodological foundation for the catalog, ensuring that vocal capture techniques, articulation taxonomy, and provenance standards are applied rigorously from the outset.
A three-part listening benchmark: a mixed musical demo built from this dataset, the raw source clip, and an AI model’s attempt to reproduce the same prompt.
A musical demonstration created by replacing a state-of-the-art AI-generated lead instrument with original source recordings from this dataset, then arranged and mastered to preserve musical context. This approach allows direct comparison between current-generation model output and real, rights-cleared acoustic source material.
Directly from the dataset: an isolated, unprocessed example of the source recording.
An unmodified output from a current-gen AI model given the same musical prompt. Included to illustrate where today’s systems still differ from real, recorded sources.
AI model approximations generated using publicly available state-of-the-art music generation systems.
Harmonic Frontier Audio datasets are licensed directly to research teams, startups, and enterprise partners. Access models and terms vary based on use case, scale, and integration needs.
All datasets are delivered with versioned metadata, documented workflows, and Proteus-backed integrity manifests.
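The format of the Proteus integrity manifests is not documented on this page. As a generic illustration of how a versioned integrity manifest can work, the sketch below hashes each delivered file with SHA-256 and later reports any file that is missing or has changed. The manifest layout, the `*.wav` pattern, and the function names are assumptions, not the actual Proteus format.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large audio files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, version):
    """Map each .wav file under `root` (by relative path) to its SHA-256 hash."""
    root = Path(root)
    files = {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*.wav"))
    }
    return {"dataset_version": version, "files": files}


def verify_manifest(root, manifest):
    """Return relative paths that are missing or whose hash no longer matches."""
    root = Path(root)
    return [
        name
        for name, expected in manifest["files"].items()
        if not (root / name).is_file() or sha256_of_file(root / name) != expected
    ]
```

A recipient would call `build_manifest` once at delivery, or load the supplied manifest, and run `verify_manifest` before training or analysis to confirm that no file was altered in transit or storage.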
We typically respond to inquiries within 1–2 business days.
This dataset is actively being recorded and prepared. You can request early access, previews, or discuss licensing timelines.