Human Vocality Primitives

Plosives & Non-Lexical Consonant Bursts

Unvoiced consonant transients including plosives, clicks, bursts, and shaped onsets.

Structured at the articulation level using documented production workflows and secured under the Proteus Standard™.

This dataset is currently in production. Preview audio and full specifications will be added as they become available.

This is a preview release provided via Hugging Face for evaluation purposes. Initial recording passes are available to assess capture quality, labeling structure, and dataset relevance. The full dataset will be delivered privately with complete Proteus provenance, integrity, and documentation upon licensing.

This dataset is complete and available for licensing. A public preview subset will be released via Hugging Face on a rolling basis.

Plosives and non-lexical consonant bursts are airflow-driven vocal events produced without sustained vocal fold vibration, relying instead on brief oral constriction and rapid pressure release to generate sound. In these events, acoustic energy is created through short transient airflow bursts rather than sustained turbulence or harmonic excitation, resulting in discrete noise-based impulses with no stable fundamental pitch. Expressive control emerges through modulation of pressure buildup, release force, articulator placement, and timing rather than changes in loudness or pitch.

Acoustically, plosives and non-lexical consonant bursts occupy a broadband, transient-dominant timbral space characterized by short-duration energy spikes, noise-based release characteristics, and fine-grained temporal precision. Expressivity is conveyed through articulatory contact location, release intensity, ingressive versus egressive airflow, and burst timing rather than through sustained airflow texture or vocal tract resonance. These characteristics make the material well suited to transient unvoiced vocal modeling, articulatory onset modeling, and airflow-centric acoustic analysis, where precise burst timing, controlled release behavior, and repeatable articulatory events are essential.
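As an illustration of what burst-timing analysis on material like this could look like, the sketch below marks onsets wherever short-time RMS energy first rises above a fixed threshold. It is a minimal example under stated assumptions, not the dataset's labeling method: the function names, the 1 ms frame length, and the 0.1 threshold are all hypothetical choices, and the test signal is synthetic rather than taken from the recordings.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS level of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return levels

def detect_burst_onsets(samples, sample_rate, frame_ms=1.0, threshold=0.1):
    """Return onset times (seconds) where frame RMS first rises above threshold."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    onsets = []
    was_active = False
    for i, level in enumerate(frame_rms(samples, frame_len)):
        active = level > threshold
        if active and not was_active:
            onsets.append(i * frame_len / sample_rate)
        was_active = active
    return onsets

# Synthetic check: 1 s of silence at 96 kHz with two 5 ms bursts.
sample_rate = 96000
signal = [0.0] * sample_rate
for burst_start in (0.25, 0.75):
    first = int(burst_start * sample_rate)
    for n in range(first, first + 480):
        # Alternating-sign samples stand in for a noise burst (assumption).
        signal[n] = 0.5 if n % 2 == 0 else -0.5

onsets = detect_burst_onsets(signal, sample_rate)
print(onsets)
```

A fixed threshold is the simplest possible choice. Real recordings with room tone near the burst level would likely need an adaptive threshold or a spectral-flux measure instead.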

Dataset Overview

Key technical details for this dataset — including file counts, duration, delivery format, and session context.

Planned technical specifications and recording standards for this dataset.

Total Files:

Total Files (Preview):

Total Duration (Hours):

0.01

Sample Rate (Hz):

96000

Bit Depth (Delivery):

24

Dataset Version:

v1.0

Recording Environment:

Treated Studio

Microphone Configuration:

Rode NT1-A positioned 4–5 inches from the mouth

Performer:

Blake Pullen

Recording Dates:

January 30, 2026

Recording Location:

Las Vegas, NV

Produced using standardized capture, editing, and QC protocols with versioned metadata and Proteus-backed provenance.
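For anyone evaluating delivered files against the specifications above, a header check along these lines may be useful. This is a sketch, not an official validation tool. It assumes the files are plain PCM WAV (the page states 24-bit PCM but does not name a container), and `check_wav_spec` and the constant names are hypothetical.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 96000     # from the "Sample Rate (Hz)" field
EXPECTED_SAMPLE_WIDTH = 3    # bytes; assumes the 24-bit PCM delivery format

def check_wav_spec(path):
    """Return a list of mismatches between a WAV header and the expected spec."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width {wav.getsampwidth()} bytes")
    return problems

def write_silence(path, rate_hz, frames=960):
    """Write a short mono 24-bit file of silence, used only for the demo."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(3)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00\x00" * frames)

# Demo on throwaway files, not on dataset audio.
demo_dir = tempfile.mkdtemp()
good_path = os.path.join(demo_dir, "good.wav")
bad_path = os.path.join(demo_dir, "bad.wav")
write_silence(good_path, 96000)
write_silence(bad_path, 48000)

good_problems = check_wav_spec(good_path)
bad_problems = check_wav_spec(bad_path)
print(good_problems, bad_problems)
```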

Content and Recording Details

An overview of what’s included in this dataset — from articulations and performance styles to session context and recording notes.

What's in this dataset

This dataset contains recordings of plosive and non-lexical consonant burst gestures, captured in isolation to emphasize transient, unvoiced airflow events. The material is designed to document the structural organization, capture quality, and articulatory taxonomy of plosive-like burst behaviors as foundational consonant primitives, rather than to convey voiced sound, semantic speech content, or linguistic meaning.

Included recordings emphasize brief bilabial pressure releases, tongue-tip and posterior tongue burst events, unshaped oral air bursts, compressed noise releases, and ingressive intake-style bursts. All material is presented in a neutral, non-performative context to support articulatory modeling, speech research, and airflow-centric audio analysis workflows. Together, the dataset provides a structured, repeatable corpus suitable for studying and modeling transient unvoiced consonant behavior across a range of articulator locations, pressure levels, and release dynamics.

Recording & Session Notes

All audio was recorded in a controlled studio environment using standardized capture, editing, and QC protocols applied consistently across the Harmonic Frontier Audio catalog.

Source material was captured at 32-bit float to preserve full dynamic headroom and minimize quantization artifacts during editing and processing. Final dataset files are delivered as 24-bit PCM for consistency and downstream compatibility.
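To make the float-to-PCM step concrete, here is one common way a floating-point sample can be quantized to signed 24-bit PCM. It is a sketch under assumptions: the source does not say how the conversion was performed (for example, whether dither was applied), and the helper names are hypothetical. Samples above full scale, which a float capture can hold, are clipped here.

```python
def float_to_pcm24(x):
    """Quantize one float sample (nominally in [-1.0, 1.0]) to a signed 24-bit integer."""
    scaled = int(round(x * 8388608))              # 2**23 steps per unit amplitude
    return max(-8388608, min(8388607, scaled))    # clip to the signed 24-bit range

def pack_pcm24(samples):
    """Pack float samples as little-endian 24-bit PCM, 3 bytes per sample."""
    return b"".join(
        float_to_pcm24(s).to_bytes(3, "little", signed=True) for s in samples
    )

packed = pack_pcm24([0.0, 0.5, -1.0, 1.0])
print(len(packed))  # 4 samples * 3 bytes
```

Positive full scale maps to 8388607 rather than 8388608 because the signed 24-bit range is asymmetric.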

A single performer and vocal source were used consistently across all sessions to maintain physiological continuity, articulatory stability, and repeatable transient airflow behavior.

Vocal source details:
Human voice — non-lexical plosive and consonant burst techniques

Additional processing was limited to trimming, fade handling, and integrity checks. No creative processing, normalization, compression, or dynamic shaping was applied beyond what was necessary for clean, faithful delivery of the recorded material.
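The page does not describe how its integrity checks work. As a general illustration of the idea, the sketch below compares SHA-256 digests of files against a stored manifest. The `{filename: digest}` manifest layout, the function names, and the demo filename are hypothetical and should not be read as the Proteus manifest format.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(root, manifest):
    """Return the file names whose current hash differs from the manifest."""
    return [
        name for name, expected in manifest.items()
        if sha256_of_file(os.path.join(root, name)) != expected
    ]

# Demo with a throwaway file and a hypothetical manifest.
root = tempfile.mkdtemp()
clip_path = os.path.join(root, "clip_0001.wav")
with open(clip_path, "wb") as handle:
    handle.write(b"example bytes")
manifest = {"clip_0001.wav": sha256_of_file(clip_path)}
unchanged = verify_manifest(root, manifest)

with open(clip_path, "ab") as handle:
    handle.write(b"!")  # simulate a modification after the manifest was written
modified = verify_manifest(root, manifest)
print(unchanged, modified)
```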

Techniques, Articulations & Gesture Types

A structured breakdown of the expressive building blocks in this dataset — including articulations, dynamics, transitions, and any extended techniques captured during recording.

Unlike clip- or phrase-based datasets, this dataset is structured at the articulation and gesture level. This enables interpretable control, expressive variability, and human-aligned modeling, but it also increases production complexity and narrows the pool of producers who can build such datasets correctly at scale.

Articulations Included

This dataset includes a structured set of plosive and non-lexical consonant burst articulations, recorded in isolation to support articulation-aware modeling and analysis of transient, unvoiced vocal behavior.

Articulations include:

• Bilabial pressure-release bursts produced through brief lip closure and release
• Tongue-tip and alveolar burst events generated by rapid articulatory release
• Posterior tongue and velar release bursts with controlled pressure buildup
• Unshaped and lightly constricted oral air bursts producing non-sibilant noise
• Ingressive intake-style burst events produced without speech intent

Articulations are recorded without voicing, linguistic content, accompaniment, or rhythmic framing to preserve clarity, separability, and modeling utility across speech research, articulatory modeling, and airflow-centric analysis contexts.
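One way a consumer of articulation-level labels might represent the five categories listed above is with a small enumeration. This is an illustrative sketch only: the class names, string values, and `ClipLabel` fields are hypothetical and do not reflect the dataset's actual metadata schema, which is not shown on this page.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum

class Articulation(Enum):
    """The five articulation categories named in the list above."""
    BILABIAL_RELEASE = "bilabial_release"
    ALVEOLAR_RELEASE = "alveolar_release"
    VELAR_RELEASE = "velar_release"
    ORAL_AIR_BURST = "oral_air_burst"
    INGRESSIVE_BURST = "ingressive_burst"

@dataclass(frozen=True)
class ClipLabel:
    """Hypothetical per-clip label; all clips are described as unvoiced."""
    filename: str
    articulation: Articulation
    voiced: bool = False

def count_by_articulation(labels):
    """Count clips per articulation category."""
    return Counter(label.articulation for label in labels)

# Made-up file names, for illustration only.
labels = [
    ClipLabel("a.wav", Articulation.BILABIAL_RELEASE),
    ClipLabel("b.wav", Articulation.BILABIAL_RELEASE),
    ClipLabel("c.wav", Articulation.INGRESSIVE_BURST),
]
counts = count_by_articulation(labels)
print(counts[Articulation.BILABIAL_RELEASE])
```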

Extended Techniques & Gesture Types

This dataset includes a focused set of gesture-level vocal behaviors capturing non-lexical plosive and consonant burst techniques, with an emphasis on pressure buildup, articulatory contact, rapid release, and transitions at the threshold between silence and transient sound.

Gesture types include:

• Micro-variation in pressure buildup and release intensity across repeated burst events
• Articulatory contact and release at bilabial, alveolar, and posterior tongue locations
• Subtle changes in oral constriction affecting burst noise texture and release sharpness
• Controlled transitions from silence into brief burst events and back to silence
• Near-threshold burst behaviors where articulatory release, airflow noise, and room tone converge

Gestures are recorded in isolation and without linguistic framing to preserve clarity, repeatability, and modeling utility, supporting detailed analysis of transient unvoiced vocal behavior and articulatory event generation across speech research, articulatory modeling, and airflow-centric analysis contexts.

Proteus Standard Compliance

This dataset was recorded, documented, and released under The Proteus Standard™, Harmonic Frontier Audio’s framework for rights-cleared, provenance-audited audio data.

The Proteus Standard ensures:

• Performer-owned, contract-clean source material
• Transparent recording methodology and metadata
• Consistent capture, QC, and documentation practices across the catalog

Layer I — Source Provenance

Layer II — Cryptographic Integrity

Layer III — Acoustic Fingerprinting

Performers

Captured with expert musicians and vocalists across global traditions — ensuring each dataset carries authentic nuance, human expression, and rights-managed provenance.

Blake Pullen

Blake Pullen is a multidisciplinary vocalist, musician, and recording engineer with a background spanning formal vocal performance, traditional acoustic music, and high-fidelity audio production.

With formal training in vocal performance and extensive experience recording both instruments and extended vocal techniques, Blake approaches dataset creation from a physiological, acoustic, and systems-oriented perspective. His work emphasizes articulatory precision, repeatability across sessions, and capture practices designed to support long-term machine learning, speech research, and expressive voice modeling rather than performative presentation.

As the founder of Harmonic Frontier Audio, he performs and records the initial datasets to establish a consistent technical and methodological foundation for the catalog, ensuring that vocal capture techniques, articulation taxonomy, and provenance standards are applied rigorously from the outset.

Audio Demonstrations

A three-part listening benchmark: a mixed musical demo built from this dataset, the raw source clip, and an AI model’s attempt to reproduce the same prompt.

PRODUCED REFERENCE

A musical demonstration created by replacing a state-of-the-art AI-generated lead instrument with original source recordings from this dataset, then arranged and mastered to preserve musical context. This approach allows direct comparison between current-generation model output and real, rights-cleared acoustic source material.

RAW DATASET CLIP

Directly from the dataset: an isolated, unprocessed example of the source recording.

AI MODEL BENCHMARK (Suno v5 Pro Beta)

An unmodified output from a current-gen AI model given the same musical prompt. Included to illustrate where today’s systems still differ from real, recorded sources.

AI model approximations generated using publicly available state-of-the-art music generation systems.

Interested in licensing this dataset?

Harmonic Frontier Audio datasets are licensed directly to research teams, startups, and enterprise partners. Access models and terms vary based on use case, scale, and integration needs.

All datasets are delivered with versioned metadata, documented workflows, and Proteus-backed integrity manifests.

We typically respond to inquiries within 1–2 business days.
