speechbrain 1.0.0

pip install speechbrain

Released: Feb 26, 2024

All-in-one speech toolkit in pure Python and Pytorch


License: Apache Software License

Author: Mirco Ravanelli & Others

Requires: Python >=3.7

Classifiers

OSI Approved :: Apache Software License
Python :: 3

Project description

Exciting News (January 2024): Discover what is new in SpeechBrain 1.0 here!

🗣️💬 What SpeechBrain Offers

SpeechBrain is an open-source PyTorch toolkit that accelerates Conversational AI development, i.e., the technology behind speech assistants, chatbots, and large language models.

It is crafted for fast and easy creation of advanced technologies for Speech and Text Processing.

With the rise of deep learning, once-distant domains like speech processing and NLP are now very close. A well-designed neural network and large datasets are all you need.

We think it is now time for a holistic toolkit that, mimicking the human brain, jointly supports diverse technologies for complex Conversational AI systems.

This spans speech recognition, speaker recognition, speech enhancement, speech separation, language modeling, dialogue, and beyond.

📚 Training Recipes

We share over 200 competitive training recipes on more than 40 datasets supporting 20 speech and text processing tasks (see below).

We support both training from scratch and fine-tuning pretrained models such as Whisper, Wav2Vec2, WavLM, Hubert, GPT2, Llama2, and beyond. The models on HuggingFace can be easily plugged in and fine-tuned.

For any task, you train the model with a single command of the form: python train.py hparams/train.yaml

The hyperparameters are encapsulated in a YAML file, while the training process is orchestrated through a Python script.

We maintain a consistent code structure across different tasks.

For better replicability, training logs and checkpoints are hosted on Dropbox.

Pretrained Models and Inference

Access over 100 pretrained models hosted on HuggingFace.
Each model comes with a user-friendly interface for seamless inference. For example, transcribing speech using a pretrained model requires just three lines of code:
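The snippet that originally followed appears to have been lost from this page. As a minimal sketch, assuming the speechbrain.inference.ASR module of SpeechBrain 1.0 and the speechbrain/asr-crdnn-rnnlm-librispeech model on HuggingFace, the three steps (import, load, transcribe) could look like this. The helper name transcribe and the savedir default are illustrative choices, not part of the toolkit.

```python
def transcribe(audio_path,
               source="speechbrain/asr-crdnn-rnnlm-librispeech",
               savedir="pretrained_models/asr"):  # savedir is an assumed cache folder
    """Transcribe one audio file with a pretrained SpeechBrain ASR model."""
    # Imported lazily so that defining this helper does not require
    # speechbrain (and PyTorch) to be installed.
    from speechbrain.inference.ASR import EncoderDecoderASR

    # Fetches the model from HuggingFace on first use and caches it in savedir.
    asr_model = EncoderDecoderASR.from_hparams(source=source, savedir=savedir)
    return asr_model.transcribe_file(audio_path)
```

Calling transcribe("my_recording.wav") should return the recognized text; the first call needs network access to download the model.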

Documentation

We are deeply dedicated to promoting inclusivity and education.
We have authored over 30 tutorials on Google Colab that not only describe how SpeechBrain works but also help users familiarize themselves with Conversational AI.
Every class or function has clear explanations and examples that you can run. Check out the documentation for more details 📚.

🎯 Use Cases

🚀 Research Acceleration: Speeding up academic and industrial research. You can develop and integrate new models effortlessly, comparing their performance against our baselines.

⚡️ Rapid Prototyping: Ideal for quick prototyping in time-sensitive projects.

🎓 Educational Tool: SpeechBrain's simplicity makes it a valuable educational resource. It is used by institutions like Mila, Concordia University, Avignon University, and many others for student training.

🚀 Quick Start

To get started with SpeechBrain, follow these simple steps:

🛠️ Installation

Install via PyPI

Install SpeechBrain from PyPI with: pip install speechbrain

Access SpeechBrain in your Python code:
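The original import snippet is missing here. A small sketch using only the Python standard library can confirm that the package is visible to your interpreter before you import it. The function name speechbrain_status is hypothetical.

```python
import importlib.util
from importlib import metadata


def speechbrain_status():
    """Return a short message saying whether speechbrain can be imported."""
    if importlib.util.find_spec("speechbrain") is None:
        return "speechbrain not found; install it with: pip install speechbrain"
    try:
        version = metadata.version("speechbrain")
    except metadata.PackageNotFoundError:
        # The module is importable but no distribution metadata was found.
        version = "unknown version"
    return f"speechbrain found ({version})"


print(speechbrain_status())
```

Once the check passes, the package can be imported in the usual way (for instance under the short alias sb).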

Install from GitHub

This installation is recommended for users who wish to conduct experiments and customize the toolkit according to their needs.

Clone the GitHub repository and install the requirements:

Any modifications made to the speechbrain package will be automatically reflected, thanks to the --editable flag.

✔️ Test Installation

Ensure your installation is correct by running the following commands:

🏃‍♂️ Running an Experiment

In SpeechBrain, you can train a model for any task using the following steps:

The results will be saved in the output_folder specified in the YAML file.

📘 Learning SpeechBrain

Website: Explore general information on the official website.

Tutorials: Start with basic tutorials covering fundamental functionalities. Find advanced tutorials and topics in the Tutorials menu on the SpeechBrain website.

Documentation: Detailed information on the SpeechBrain API, contribution guidelines, and code is available in the documentation.

🔧 Supported Technologies

SpeechBrain is a versatile framework designed for implementing a wide range of technologies within the field of Conversational AI.
It excels not only in individual task implementations but also in combining various technologies into complex pipelines.

🎙️ Speech/Audio Processing

📝 Text Processing

🔍 Additional Features

SpeechBrain includes a range of native functionalities that enhance the development of Conversational AI technologies. Here are some examples:

Training Orchestration: The Brain class serves as a fully customizable tool for managing training and evaluation loops over data. It simplifies training loops while providing the flexibility to override any part of the process.

Hyperparameter Management: A YAML-based hyperparameter file specifies all hyperparameters, from individual numbers (e.g., learning rate) to complete objects (e.g., custom models). This elegant solution drastically simplifies the training script.

Dynamic Dataloader: Enables flexible and efficient data reading.

GPU Training: Supports single and multi-GPU training, including distributed training.

Dynamic Batching: On-the-fly dynamic batching enhances the efficient processing of variable-length signals.

Mixed-Precision Training: Accelerates training through mixed-precision techniques.

Efficient Data Reading: Reads large datasets efficiently from a shared Network File System (NFS) via WebDataset.

Hugging Face Integration: Interfaces seamlessly with HuggingFace for popular models such as wav2vec2 and Hubert.

Orion Integration: Interfaces with Orion for hyperparameter tuning.

Speech Augmentation Techniques: Includes SpecAugment, Noise, Reverberation, and more.

Data Preparation Scripts: Includes scripts for preparing data for supported datasets.
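To make the dynamic batching item above more concrete, here is a minimal sketch of the general idea in plain Python. It is not SpeechBrain's own sampler: the function name dynamic_batches and the max_batch_len budget are assumptions for illustration. Items are sorted by length and grouped so that the padded size of each batch (longest item times number of items) stays within the budget.

```python
def dynamic_batches(lengths, max_batch_len):
    """Group item indices into batches whose padded size stays within a budget.

    lengths: duration of each item (e.g. in samples or seconds).
    max_batch_len: budget for (longest item in batch) * (number of items).
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches, current = [], []
    for idx in order:
        # Items arrive sorted by length, so the candidate is the longest so far.
        padded_size = lengths[idx] * (len(current) + 1)
        if current and padded_size > max_batch_len:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches


print(dynamic_batches([2, 9, 3, 8, 1, 4], max_batch_len=12))
```

Short signals end up sharing a batch while long ones are processed alone, which limits the amount of padding. An item longer than the budget still forms a batch of its own.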

SpeechBrain is rapidly evolving, with ongoing efforts to support a growing array of technologies in the future.

📊 Performance

SpeechBrain integrates a variety of technologies, including those that achieve competitive or state-of-the-art performance.

For a comprehensive overview of the achieved performance across different tasks, datasets, and technologies, please visit here.

SpeechBrain is released under the Apache License, version 2.0, a popular BSD-like license.
You are free to redistribute SpeechBrain for both free and commercial purposes, with the condition of retaining license headers. Unlike the GPL, the Apache License is not viral, meaning you are not obligated to release modifications to the source code.

🔮Future Plans

We have ambitious plans for the future, with a focus on the following priorities:

Scale Up: Our aim is to provide comprehensive recipes and technologies for training massive models on extensive datasets.

Scale Down: While scaling up delivers unprecedented performance, we recognize the challenges of deploying large models in production scenarios. We are focusing on real-time, streamable, and small-footprint Conversational AI.

🤝 Contributing

SpeechBrain is a community-driven project, led by a core team with the support of numerous international collaborators.
We welcome contributions and ideas from the community. For more information, check here.
SpeechBrain is an academically driven project and relies on the passion and enthusiasm of its contributors.
As we cannot rely on the resources of a large company, we deeply appreciate any form of support, including donations or collaboration with the core team.
If you're interested in sponsoring SpeechBrain, please reach out to us at [email protected].
A heartfelt thank you to all our sponsors, including the current ones:

📖 Citing SpeechBrain

If you use SpeechBrain in your research or business, please cite it using the following BibTeX entry:


SpeechBrain

A PyTorch Powered Speech Toolkit

Key Features

SpeechBrain is an open-source and all-in-one conversational AI toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains.

Speech Recognition

SpeechBrain supports state-of-the-art methods for end-to-end speech recognition, including models based on CTC, CTC+attention, transducers, transformers, and neural language models relying on recurrent neural networks and transformers.

Speaker Recognition

Speaker recognition is already deployed in a wide variety of realistic applications. SpeechBrain provides different models for speaker recognition, including X-vector, ECAPA-TDNN, PLDA, and contrastive learning.

Speech Enhancement

Spectral masking, spectral mapping, and time-domain enhancement are different methods already available within SpeechBrain. Separation methods such as Conv-TasNet, DualPath RNN, and SepFormer are implemented as well.

Speech Processing

SpeechBrain provides efficient and GPU-friendly pipelines for speech augmentation, acoustic feature extraction, and normalisation that can be used on the fly during your experiment.

Multi Microphone Processing

Combining multiple microphones is a powerful approach to achieve robustness in adverse acoustic environments. SpeechBrain provides various techniques for beamforming (e.g., delay-and-sum, MVDR, and GeV) and speaker localization.
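As an illustration of the simplest of these techniques, here is a minimal delay-and-sum sketch in plain Python. It assumes the per-microphone delays are already known and are whole numbers of samples; it is not SpeechBrain's implementation, and the function name delay_and_sum is hypothetical. Each channel is shifted back by its delay and the aligned channels are averaged.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays: delays[m] is how many samples channel m lags the reference.
    """
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay  # read ahead to undo the delay
            if 0 <= j < n:
                out[i] += signal[j]
    return [value / len(channels) for value in out]


# The same impulse reaches microphone 1 two samples later than microphone 0.
mic0 = [0.0, 1.0, 0.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 0.0, 1.0, 0.0]
print(delay_and_sum([mic0, mic1], delays=[0, 2]))
```

After alignment the impulse adds up coherently, while uncorrelated noise on the two channels would be averaged down.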

Text-to-Speech

Text-to-Speech (TTS, also known as Speech Synthesis) allows users to generate speech signals from an input text. SpeechBrain supports popular models for TTS (e.g., Tacotron2) and vocoders (e.g., HiFi-GAN).

Research & Development

SpeechBrain is designed to speed up research and development of speech technologies. It is modular, flexible, easy to customize, and contains several recipes for popular datasets. Documentation and tutorials are here to help newcomers use SpeechBrain.

HuggingFace!

SpeechBrain provides multiple pre-trained models that can easily be deployed with nicely designed interfaces. Transcribing, verifying speakers, enhancing speech, and separating sources have never been this easy!

Why SpeechBrain?

SpeechBrain allows you to easily and quickly customize any part of your speech pipeline ranging from the data management up to the downstream task metric. No existing speech toolkit provides such a level of accessibility.

Easy to install
Adapts to your needs. SpeechBrain allows users to install either via PyPI to rapidly use the standard library or via a local install to view recipes and further explore the features of the toolkit.

Easy to use
A single command. Every SpeechBrain recipe relies on a YAML file that summarizes all the functions and hyperparameters of the system. A single Python script combines them to implement the desired task.

Easy to customize
Built for research. SpeechBrain is designed for research and development. Hence, flexibility and transparency are core concepts to facilitate our daily work. You can define your own deep learning models, losses, training / evaluation loops, input pipeline / transformations and use them handily without overhead.

They are, or they sponsored SpeechBrain!

Our new call for sponsors (2022) is now open.


Introducing SpeechBrain: A general-purpose PyTorch speech processing toolkit


What is SpeechBrain?

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to make the research and development of neural speech processing technologies easier by being simple, flexible, user-friendly, and well-documented.

We designed it to natively support multiple speech tasks of common interest, including:

Speech Recognition, i.e. speech-to-text.
Spoken Language Understanding, i.e. speech-to-semantics.
Speaker Recognition, i.e. identifying or verifying speaker identities from speech recordings.
Speech Enhancement, i.e. improving the quality of the speech signal by removing noise.
Speech Separation, i.e. separating multiple speakers speaking at the same time.
Speaker Diarization, i.e. detecting who spoke when.
Multi-microphone signal processing, i.e. combining the information recorded by multiple microphones.

Many other tasks such as text-to-speech, sound event classification, and self-supervised learning will be supported soon. The toolkit provides training recipes for popular speech datasets. Pre-trained models are released on Hugging Face (https://huggingface.co/speechbrain/), along with intuitive functionalities for inference and fine-tuning. To help beginners familiarize themselves with the toolkit, we wrote several tutorials on Google Colab (https://speechbrain.github.io/tutorial_basics.html). SpeechBrain is released under the Apache License, version 2.0.

Website: https://speechbrain.github.io/ GitHub: https://github.com/speechbrain/speechbrain

The availability of open-source software is playing a remarkable role in the deep learning community, as was demonstrated with Theano [1] and its Deep Learning Tutorials [2] in the early years of deep learning. Nowadays, one of the most commonly used toolkits is PyTorch [3], thanks to its modern and flexible design that supports GPU-based tensor computations and facilitates the development of dynamically structured neural architectures with proper routines for automatic gradient computation.

In parallel to general-purpose deep learning software, some speech processing toolkits have also gained popularity within the research community. Most of these toolkits are limited to specific speech tasks. For instance, Kaldi [4] is an established framework used to develop state-of-the-art speech recognizers.

Even though many of these frameworks work well for the specific task for which they are designed, our experience in the field suggests that having a single, efficient, and flexible toolkit can significantly speed up the research and development of speech and audio processing techniques. It is significantly easier to familiarize oneself with a single toolkit than to learn several different frameworks, especially since all state-of-the-art speech processing techniques share the same underlying technology: deep learning. SpeechBrain therefore consolidates all speech processing tasks within a single toolkit for the benefit of the research community.

Only recently, some excellent speech toolkits able to support different speech tasks have been publicly released. Examples are ESPnet [5] and NeMo [6]. Along this line, we recently released SpeechBrain, which we designed from scratch with the goal of making it simple, flexible, and modular. We want SpeechBrain to be suitable for educational purposes as well. We thus put major effort into rich documentation and tutorials, to help beginners familiarize themselves with our toolkit.

Usage Example

You can install SpeechBrain from PyPI with: pip install speechbrain

If you prefer a local installation, you can type:

Inference with a pre-trained model

Once installed, we can start playing with it. Let’s see first how easy it is to use one of our pre-trained models stored on Hugging Face ( https://huggingface.co/speechbrain ). For instance, you can use a speech recognition model (trained on LibriSpeech) to transcribe an audio recording:

You can also perform speaker verification, to check whether two recordings come from the same speaker or from different ones.
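The original example is missing from this page. A minimal sketch, assuming the speechbrain.inference.speaker module of SpeechBrain 1.0 and the speechbrain/spkrec-ecapa-voxceleb model on HuggingFace, might look as follows. The helper name same_speaker and the savedir default are illustrative.

```python
def same_speaker(wav_a, wav_b,
                 source="speechbrain/spkrec-ecapa-voxceleb",
                 savedir="pretrained_models/spkrec"):  # assumed cache folder
    """Return (similarity score, decision) for two recordings."""
    # Lazy import: speechbrain is only needed when the helper is called.
    from speechbrain.inference.speaker import SpeakerRecognition

    verifier = SpeakerRecognition.from_hparams(source=source, savedir=savedir)
    # verify_files returns a similarity score and a same-speaker decision.
    score, prediction = verifier.verify_files(wav_a, wav_b)
    return float(score), bool(prediction)
```

A decision of True means the model judges both files to come from the same speaker.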

We also provide some pre-trained models for speech separation (using the SepFormer architecture):
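The separation example is also missing here. A minimal sketch, assuming the speechbrain.inference.separation module of SpeechBrain 1.0 and the speechbrain/sepformer-wsj02mix model on HuggingFace; the helper name separate_speakers and the savedir default are illustrative.

```python
def separate_speakers(mixture_path,
                      source="speechbrain/sepformer-wsj02mix",
                      savedir="pretrained_models/sepformer"):  # assumed cache folder
    """Separate a two-speaker mixture into its estimated sources."""
    # Lazy import: speechbrain is only needed when the helper is called.
    from speechbrain.inference.separation import SepformerSeparation

    model = SepformerSeparation.from_hparams(source=source, savedir=savedir)
    # The result is a tensor shaped [batch, time, n_sources].
    return model.separate_file(path=mixture_path)
```

Each estimated source can then be taken from the last dimension of the returned tensor and saved as its own audio file.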

We support many other tasks (see https://huggingface.co/speechbrain/ ). As you can see, you can easily use pre-trained SpeechBrain models with just a few lines of code.

Training a model

In addition to commonly used speech processing building blocks and pre-trained models, SpeechBrain comes with many recipes for training state-of-the-art speech models from scratch on a variety of tasks.

If you go into the main project folder, you can type:

Here, the dataset is the corpus that you would like to use for training (e.g., LibriSpeech) and the task is the speech task you want to solve with this dataset (e.g., automatic speech recognition).

Then, we run a simple command, like this:

This command trains and tests a model. All the hyperparameters are summarized in a yaml file, while the main script for training is train.py.

yaml allows us to specify the hyperparameters in an elegant, flexible, and transparent way. Let’s see for instance this yaml snippet:

As you can see, this is not just a plain list of hyperparameters. For each parameter, we specify the class (or function) that is going to use it. This makes the code more transparent and easier to debug.

The yaml file contains all the information to initialize the classes when loading them. In SpeechBrain we load it with a special function called load_hyperpyyaml, which initializes all the declared classes. This makes the code extremely readable and compact.
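To illustrate the idea of a hyperparameter file that declares objects as well as numbers, the following sketch mimics it in plain Python with a dictionary instead of YAML. It does not use HyperPyYAML itself; the build function and the "class"/"args" keys are assumptions made for this example.

```python
import importlib


def build(config):
    """Instantiate entries of the form {"class": "module.Name", "args": {...}}."""
    objects = {}
    for name, spec in config.items():
        if isinstance(spec, dict) and "class" in spec:
            module_name, class_name = spec["class"].rsplit(".", 1)
            cls = getattr(importlib.import_module(module_name), class_name)
            objects[name] = cls(**spec.get("args", {}))
        else:
            objects[name] = spec  # plain value such as a learning rate
    return objects


hparams = build({
    "lr": 0.001,
    "n_epochs": 5,
    # A standard-library class stands in for a model or optimizer here.
    "delta": {"class": "datetime.timedelta", "args": {"seconds": 30}},
})
print(hparams["lr"], hparams["delta"].total_seconds())
```

The training script then only has to look up ready-made objects by name, which is what keeps it short.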

The experiment file (e.g., example_asr_ctc_experiment.py in the example) trains a model by combining the functions or classes declared in the yaml file. This script defines the data processing pipeline and defines all the computations from the input signal to the final cost function. Everything is designed to be easy to customize.

To make training easier, SpeechBrain includes the Brain class, which uses overloadable routines for training a model over multiple epochs, validation, checkpointing, and data loading. Our flexible DynamicItemDataset data loader class allows the data reading pipeline to be fully customized directly in the experiment file.
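The pattern of a training loop with overridable steps can be sketched in a few lines of plain Python. This toy MiniBrain is not SpeechBrain's Brain class: it fits a single weight and uses a finite-difference gradient so that subclasses can replace the forward pass or the loss without touching the loop. All names here are illustrative.

```python
class MiniBrain:
    """Toy training loop with overridable steps (not SpeechBrain's Brain)."""

    def __init__(self, weight=0.0, lr=0.05):
        self.weight = weight
        self.lr = lr

    def compute_forward(self, batch):
        x, _ = batch
        return self.weight * x

    def compute_objectives(self, prediction, batch):
        _, y = batch
        return (prediction - y) ** 2

    def fit_batch(self, batch, eps=1e-6):
        loss = self.compute_objectives(self.compute_forward(batch), batch)
        # Finite-difference gradient, so overridden steps still train.
        self.weight += eps
        shifted = self.compute_objectives(self.compute_forward(batch), batch)
        self.weight -= eps
        self.weight -= self.lr * (shifted - loss) / eps
        return loss

    def fit(self, n_epochs, train_set):
        history = []
        for _ in range(n_epochs):
            losses = [self.fit_batch(batch) for batch in train_set]
            history.append(sum(losses) / len(losses))
        return history


class AbsErrorBrain(MiniBrain):
    """Same loop, different objective: only one method is overridden."""

    def compute_objectives(self, prediction, batch):
        _, y = batch
        return abs(prediction - y)


train_set = [(1.0, 3.0), (2.0, 6.0)]  # samples of y = 3 * x
brain = MiniBrain()
history = brain.fit(50, train_set)
robust = AbsErrorBrain()
robust.fit(50, train_set)
print(round(brain.weight, 3), round(robust.weight, 1))
```

Both objects reuse the same fit loop; only the objective differs, which is the design idea behind overridable training routines.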

As a result, the code used for training is rather compact and organized in meaningful classes/functions with clear functionalities. Even for complex systems, you can run all the training experiments in all the recipes in this simple way. Right now, we have recipes for many speech datasets, including LibriSpeech, VoxCeleb, CommonVoice, AISHELL-1, AMI, DNS, Google Speech Commands, SLURP, TIMIT, Voicebank, WSJ0Mix, Fluent Speech Commands, and Timers and Such.

Future Plans

We plan to progressively build a community working on this open-source toolkit. In the future, we would like to extend the functionalities of the toolkit to include tasks such as text-to-speech, self-supervised learning, models for small footprint devices, and support for real-time online speech processing. An important component in this ambitious growth plan will be played by the open-source community.

Other help can come from sponsors. Sponsoring allows us to keep expanding the SpeechBrain team and to greatly increase the number of new features coming out. If you are interested in contributing, do not hesitate to contact us at [email protected].

Acknowledgments

A special thank you to all of the contributors who made this project possible! This project would not have been possible without the generous contribution of our current industrial sponsors: Samsung, Nvidia, Dolby, Nuance, Via-Dialog.

Contributors

Mirco Ravanelli, Mila, University of Montréal (CA)
Titouan Parcollet, Avignon Université (LIA, FR)
Aku Rouhe, Aalto University (FI)
Peter Plantinga, Ohio State University (USA)
Elena Rastorgueva
Loren Lugosch, Mila, McGill University (CA)
Nauman Dawalatabad, Indian Institute of Technology Madras (IN)
Ju-Chieh Chou, National Taiwan University (TW)
Abdel Heba, Linagora / University of Toulouse (IRIT, FR)
Francois Grondin, University of Sherbrooke (CA)
William Aris, University of Sherbrooke (CA)
Chien-Feng Liao, National Taiwan University (TW)
Samuele Cornell, Università Politecnica delle Marche (IT)
Sung-Lin Yeh, National Tsing Hua University (TW)
Hwidong Na, Visiting Researcher Samsung SAIL (CA)
Yan Gao, University of Cambridge (UK)
Szu-Wei Fu, Academia Sinica (TW)
Cem Subakan, Mila, University of Montréal (CA)
Jianyuan Zhong, University of Rochester (USA)
Brecht Desplanques, Ghent University (BE)
Jenthe Thienpondt, Ghent University (BE)
Salima Mdhaffar, Avignon Université (LIA, FR)
Renato De Mori, University of McGill (CA), Avignon University (LIA, FR)
Yoshua Bengio, Mila, University of Montréal (CA)

References:

[1] https://github.com/Theano/Theano
[2] http://deeplearning.net/software/theano
[3] https://pytorch.org/
[4] https://github.com/kaldi-asr/kaldi
[5] https://github.com/espnet/espnet
[6] https://github.com/NVIDIA/NeMo

Mirco Ravanelli

Loren Lugosch

speechbrain/metricgan-plus-voicebank

Metricgan-trained model for enhancement.

This repository provides all the necessary tools to perform enhancement with SpeechBrain. For a better experience we encourage you to learn more about SpeechBrain . The model performance is:

Install SpeechBrain

First of all, please install SpeechBrain with the following command:

We encourage you to read our tutorials and learn more about SpeechBrain.

Pretrained Usage

To use the MetricGAN+-trained model for enhancement, use the following simple code:

The system is trained with recordings sampled at 16kHz (single channel). The code will automatically normalize your audio (i.e., resampling + mono channel selection) when calling enhance_file if needed. Make sure your input tensor is compliant with the expected sampling rate if you use enhance_batch as in the example.

Inference on GPU

To perform inference on the GPU, add run_opts={"device":"cuda"} when calling the from_hparams method.
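The usage snippet is missing from this page. A minimal sketch, assuming the speechbrain.inference.enhancement module of SpeechBrain 1.0, shows how the run_opts argument fits into loading this model. The helper name enhance, the output path, and the default device are illustrative choices.

```python
def enhance(noisy_path, out_path="enhanced.wav", device="cpu"):
    """Enhance one noisy recording with the MetricGAN+ model and save it."""
    # Lazy import: speechbrain is only needed when the helper is called.
    from speechbrain.inference.enhancement import SpectralMaskEnhancement

    enhancer = SpectralMaskEnhancement.from_hparams(
        source="speechbrain/metricgan-plus-voicebank",
        savedir="pretrained_models/metricgan-plus-voicebank",
        run_opts={"device": device},  # pass device="cuda" to run on the GPU
    )
    # enhance_file normalizes the audio (resampling, mono selection) if needed
    # and writes the enhanced signal to out_path.
    enhancer.enhance_file(noisy_path, output_filename=out_path)
    return out_path
```

Calling enhance("noisy.wav", device="cuda") would run the same pipeline on a GPU, provided one is available.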

The model was trained with SpeechBrain (d0accc8). To train it from scratch, follow these steps:

Clone SpeechBrain:
Install it:
Run Training:

You can find our training results (models, logs, etc) here .

Limitations

The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.

Referencing MetricGAN+

If you find MetricGAN+ useful, please cite:

About SpeechBrain

Website: https://speechbrain.github.io/
Code: https://github.com/speechbrain/speechbrain/
HuggingFace: https://huggingface.co/speechbrain/

Citing SpeechBrain

Please cite SpeechBrain if you use it for your research or business.


The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.

speechbrain/speechbrain.github.io


Unraveling Predictive Mechanism in Speech Perception and Production: Insights from EEG Analyses of Brain Network Dynamics


Published: 23 April 2024

Bin Zhao (赵彬), Jianwu Dang (党建武) & Aijun Li (李爱军)

How neural networks coordinate to support speech perception and speech production represents a forefront research topic in both contemporary neuroscience and artificial intelligence. Despite the successful incorporation of hierarchical and predictive attributes from biological neural networks (BNNs) into artificial counterparts, substantial disparities persist, particularly in terms of real-time feedback and nonlinear regulation. To gain a more profound understanding of how BNNs manifest these attributes, the present study employed electroencephalography (EEG) techniques to examine the spatiotemporal brain network dynamics involved in listening and oral reading of identical sentences. These two tasks engage distinct sensorimotor modalities while sharing high-level semantic and syntactic representations. According to a hierarchical feedforward model, the low-level auditory and visual inputs would be progressively transformed towards abstract representations of the sentence meaning, leading to a convergence of brain network patterns in higher cognitive regions. However, our findings challenged this viewpoint by revealing an early resemblance of network activation in the prefrontal and parietal areas in both tasks. It implies a top-down predictive mechanism along with the bottom-up progression. This bidirectional interaction could be potentially implemented through frequency-specific synchronization and desynchronization between functional-specific cortical regions, laying the foundation of the speech chain system with common neural substrates.




Author information

Authors and affiliations.

Key Laboratory of Linguistics, Chinese Academy of Social Sciences, Beijing, 100732, China

Bin Zhao (赵彬) & Aijun Li (李爱军)

Corpus and Computational Linguistics Center, Institute of Linguistics, Chinese Academy of Social Sciences, Beijing, 100732, China

Aijun Li (李爱军)

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, 518055, China

Jianwu Dang (党建武)


Corresponding author

Correspondence to Bin Zhao (赵彬).

Ethics declarations

Conflict of Interest The authors declare that they have no conflict of interest.

Additional information

Foundation item: the Key Laboratory of Linguistics, Chinese Academy of Social Sciences (No. 2024SYZH001), and the National Natural Science Foundation of China (No. 62276185)


About this article

Zhao, B., Dang, J. & Li, A. Unraveling Predictive Mechanism in Speech Perception and Production: Insights from EEG Analyses of Brain Network Dynamics. J. Shanghai Jiaotong Univ. (Sci.) (2024). https://doi.org/10.1007/s12204-024-2729-9


Received: 19 December 2023

Accepted: 05 January 2024

Published: 23 April 2024

DOI: https://doi.org/10.1007/s12204-024-2729-9


speech perception and production
electroencephalography (EEG) techniques
brain network dynamics
predictive coding
frequency multiplexing


The Evolving Danger of the New Bird Flu

An unusual outbreak of the disease has spread to dairy herds in multiple U.S. states.

This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors. Please review the episode audio before quoting from this transcript and email [email protected] with any questions.

From “The New York Times,” I’m Sabrina Tavernise, and this is “The Daily.”

[MUSIC PLAYING]

The outbreak of bird flu that is tearing through the nation’s poultry farms is the worst in US history. But scientists say it’s now starting to spread into places and species it’s never been before.

Today, my colleague, Emily Anthes, explains.

It’s Monday, April 22.

Emily, welcome back to the show.

Thanks for having me. Happy to be here.

So, Emily, we’ve been talking here on “The Daily” about prices of things and how they’ve gotten so high, mostly in the context of inflation episodes. And one of the items that keeps coming up is eggs. Egg prices were through the roof last year, and we learned it was related to this. Avian flu has been surging in the United States. You’ve been covering this. Tell us what’s happening.

Yes, so I have been covering this virus for the last few years. And the bird flu is absolutely tearing through poultry flocks, and that is affecting egg prices. That’s a concern for everyone, for me and for my family. But when it comes to scientists, egg prices are pretty low on their list of concerns. Because they see this bird flu virus behaving differently than previous versions have. And they’re getting nervous, in particular, about the fact that this virus is reaching places and species where it’s never been before.

OK, so bird flu, though, isn’t new. I mean I remember hearing about cases in Asia in the ‘90s. Remind us how it began.

Bird flu refers to a bunch of different viruses that are adapted to spread best in birds. Wild water birds, in particular, are known for carrying these viruses. And flu viruses are famous for also being shapeshifters. So they’re constantly swapping genes around and evolving into new strains. And as you mentioned back in the ‘90s, a new version of bird flu, a virus known as H5N1, emerged in Asia. And it has been spreading on and off around the world since then, causing periodic outbreaks.

And how are these outbreaks caused?

So wild birds are the reservoir for the virus, which means they carry it in their bodies with them around the world as they fly and travel and migrate. And most of the time, these wild birds, like ducks and geese, don’t even get very sick from this virus. But they shed it. So as they’re traveling over a poultry farm maybe, if they happen to go to the bathroom in a pond that the chickens on the farm are using or eat some of the feed that chickens on the farm are eating, they can leave the virus behind.

And the virus can get into chickens. In some cases, it causes mild illness. It’s what’s known as low pathogenic avian influenza. But sometimes the virus mutates and evolves, and it can become extremely contagious and extremely fatal in poultry.

OK, so the virus comes through wild birds, but gets into farms like this, as you’re describing. How have farms traditionally handled outbreaks, when they do happen?

Well, because this threat isn’t new, there is a pretty well-established playbook for containing outbreaks. It’s sometimes known as stamping out. And brutally, what it means is killing the birds. So the virus is so deadly in this highly pathogenic form that it’s sort of destined to kill all the birds on a farm anyway once it gets in. So the response has traditionally been to proactively depopulate or cull all the birds, so it doesn’t have a chance to spread.

So that’s pretty costly for farmers.

It is. Although the US has a program where it will reimburse farmers for their losses. And the way these reimbursements work is they will reimburse farmers only for the birds that are proactively culled, and not for those who die naturally from the virus. And the thinking behind that is it’s a way to incentivize farmers to report outbreaks early.

So, OK, lots of chickens are killed in a way to manage these outbreaks. So we know how to deal with them. But what about now? Tell me about this new strain.

So this new version of the virus, it emerged in 2020.

After the deadly outbreak of the novel coronavirus, authorities have now confirmed an outbreak of the H5N1 strain of influenza, a kind of bird flu.

And pretty quickly it became clear that a couple things set it apart.

A bald eagle found dead at Carvins Cove has tested positive for the highly contagious bird flu.

This virus, for whatever reason, seemed very good at infecting all sorts of wild birds that we don’t normally associate with bird flu.

[BIRD CRYING]

He was kind of stepping, and then falling over, and using its wing to right itself.

Things like eagles and condors and pelicans.

We just lost a parliament of owls in Minneapolis.

Yeah, a couple of high profile nests.

And also in the past, wild birds have not traditionally gotten very sick from this virus. And this version of the virus not only spread widely through the wild bird population, but it proved to be devastating.

They’re washing up along the east coast of the country, from Scotland down to Suffolk.

We were hearing about mass die-offs of seabirds in Europe by the hundreds and the thousands.

And the bodies of the dead dot the island wherever you look.

Wow. OK. So then as we know, this strain, like previous ones, makes its way from wild animals to farmed animals, namely to chickens. But it’s even more deadly.

Absolutely. And in fact, it has already caused the worst bird flu outbreak in US history. So more than 90 million birds in the US have died as a result of this virus.

90 million birds.

Yes, and I should be clear that represents two things. So some of those birds are birds who naturally got infected and died from the virus. But the vast majority of them are birds that were proactively culled. What it adds up to is 90 million farmed birds in the US have died since this virus emerged. And it’s not just a chicken problem. Another thing that has been weird about this virus is it has jumped into other kinds of farms. It is the first time we’ve seen a bird flu virus jump into US livestock.

And it’s now been reported on a number of dairy farms across eight US states. And that’s just something that’s totally unprecedented.

So it’s showing up at dairy farms now. You’re saying that bird flu has now spread to cows. How did that happen?

So we don’t know exactly how cows were first infected, but most scientists’ best guess is that maybe an infected wild bird that was migrating shed the virus into some cattle feed or a pasture or a pond, and cattle picked it up. The good news is they don’t seem to get nearly as sick as chickens do. They are generally making full recoveries on their own in a couple of weeks.

OK, so no mass culling of cows?

No, that doesn’t seem to be necessary at this point. But the bad news is that it’s starting to look like we’re seeing this virus spread from cow to cow. We don’t know exactly how that’s happening yet. But anytime you see cow-to-cow or mammal-to-mammal transmission, that’s a big concern.

And why is that exactly?

Well, there are a bunch of reasons. First, it could allow the outbreak to get much bigger, much faster, which might increase the risk to the food supply. And we might also expect it to increase the risk to farm workers, people who might be in contact with these sick cows.

Right now, the likelihood that a farmer who gets this virus passes it on is pretty low. But any time you see mammal-to-mammal transmission, it increases the chance that the virus will adapt and possibly, maybe one day get good at spreading between humans. To be clear, that’s not something that there’s any evidence happening in cows right now. But the fact that there’s any cow-to-cow transmission happening at all is enough to have scientists a bit concerned.

And then if we think more expansively beyond what’s happening on farms, there’s another big danger lurking out there. And that’s what happens when this virus gets into wild animals, vast populations that we can’t control.

We’ll be right back.

So, Emily, you said that another threat was the threat of flu in wild animal populations. Clearly, of course, it’s already in wild birds. Where else has it gone?

Well, the reason it’s become such a threat is because of how widespread it’s become in wild birds. So they keep reintroducing it to wild animal populations pretty much anywhere they go. So we’ve seen the virus repeatedly pop up in all sorts of animals that you might figure would eat a wild bird, so foxes, bobcats, bears. We actually saw it in a polar bear, raccoons. So a lot of carnivores and scavengers.

The thinking is that these animals might stumble across a sick or dead bird, eat it, and contract the virus that way. But we’re also seeing it show up in some more surprising places, too. We’ve seen the virus in a bottle-nosed dolphin, of all places.

And most devastatingly, we’ve seen enormous outbreaks in other sorts of marine mammals, especially sea lions and seals.

So elephant seals, in particular in South America, were just devastated by this virus last fall. My colleague Apoorva Mandavilli and I were talking to some scientists in South America who described to us what they called a scene from hell, of walking out onto a beach in Argentina that is normally crowded with chaotic, living, breathing, breeding, elephant seals — and the beach just being covered by carcass, after carcass, after carcass.

Mostly carcasses of young newborn pups. The virus seemed to have a mortality rate of 95 percent in these elephant seal pups, and they estimated that it might have killed more than 17,000 of the pups that were born last year. So almost the entire new generation of this colony. These are scientists that have studied these seals for decades. And they said they’ve never seen anything like it before.

And why is it so far reaching, Emily? I mean, what explains these mass die-offs?

There are probably a few explanations. One is just how much virus is out there in the environment being shed by wild birds into water and onto beaches. These are also places that viruses like this haven’t been before. So it’s reaching elephant seals and sea lions in South America that have no prior immunity.

There’s also the fact that these particular species, these sea lions and seals, tend to breed in these huge colonies all crowded together on beaches. And so what that means is if a virus makes its way into the colony, it’s very conducive conditions for it to spread. And scientists think that that’s actually what’s happening now. That it’s not just that all these seals are picking up the virus from individual birds, but that they’re actually passing it to each other.

So basically, this virus is spreading to places it’s never been before, kind of virgin snow territory, where animals just don’t have the immunity against it. And once it gets into a population packed on a beach, say, of elephant seals, it’s just like a knife through butter.

Absolutely. And an even more extreme example of that is what we’re starting to see happen in Antarctica, where there had never been a bird flu outbreak until last fall, when this virus reached the Antarctic mainland for the first time. And we are now seeing the virus move through colonies of not only seabirds and seals, but penguin colonies, which have not been exposed to these viruses before.

And it’s too soon to say what the toll will be. But penguins also, of course, are known for breeding in these large colonies. They probably don’t have many immune defenses against this virus and, of course, are facing all these other environmental threats. And so there’s a lot of fear that you add on the stress of a bird flu virus, and it could just be a tipping point for penguins.

Emily, at this point, I’m kind of wondering why more people aren’t talking about this. I mean, I didn’t know any of this before having this conversation with you, and it feels pretty worrying.

Well, a lot of experts and scientists are talking about this with rising alarm and in terms that are quite stark. They’re talking about the virus spreading through wild animal populations so quickly and so ferociously that they’re calling it an ecological disaster.

But that’s a disaster that sometimes seems distant from us, both geographically, we’re talking about things that are happening maybe at the tip of Argentina or in Antarctica. And also from our concerns of our everyday lives, what’s happening in penguins might not seem like it has a lot to do with the price of a carton of eggs at the grocery store. But I think that we should be paying a lot of attention to how this virus is moving through animal populations, how quickly it’s moving through animal populations, and the opportunities that it is giving the virus to evolve into something that poses a much bigger threat to human health.

So the way it’s spreading in wild animals, even in remote places like Antarctica, that’s important to watch, at least in part because there’s a real danger to people here.

So we know that the virus can infect humans, and that generally it’s not very good at spreading between humans. But the concern all along has been that if this virus has more opportunities to spread between mammals, it will get better at spreading between them. And that seems to be what is happening in seals and sea lions. Scientists are already seeing evidence that the virus is adapting as it passes from marine mammal to marine mammal. And that could turn it into a virus that’s also better at spreading between people.

And if somebody walks out onto a beach and touches a dead sea lion, if their dog starts playing with a sea lion carcass, you could imagine that this virus could make its way out of marine mammals and into the human population. And if it’s this mammalian adapted version of the virus that makes its way out, that could be a bigger threat to human health.

So the sheer number of hosts that this disease has, the more opportunity it has to mutate, and the more chance it has to mutate in a way that would actually be dangerous for people.

Yes, and in particular, the more mammalian hosts. So that gives the virus many more opportunities to become a specialist in mammals instead of a specialist in birds, which is what it is right now.

Right. I like that, a specialist in mammals. So what can we do to contain this virus?

Well, scientists are exploring new options. There’s been a lot of discussion about whether we should start vaccinating chickens in the US. The government, USDA labs, have been testing some poultry vaccines. It’s probably scientifically feasible. There are challenges there, both in terms of logistics — just how would you go about vaccinating billions of chickens every year. There are also trade questions. Traditionally, a lot of countries have not been willing to accept poultry products from countries that vaccinate their poultry.

And there’s concern about whether the virus might spread undetected in flocks that are vaccinated. So as we saw with COVID, the vaccine can sometimes stop you from getting sick, but it doesn’t necessarily stop infection. And so countries are worried they might unknowingly import products that are harboring the virus.

And what about among wild animals? I mean, how do you even begin to get your head around that?

Yeah, I mean, thinking about vaccinating wild animals maybe makes vaccinating all the chickens in the US look easy. There has been some discussion of limited vaccination campaigns, but that’s not feasible on a global scale. So unfortunately, the bottom line is there isn’t a good way to stop spread in wild animals. We can try to protect some vulnerable populations, but we’re not going to stop the circulation of this virus.

So, Emily, we started this conversation with a kind of curiosity that “The Daily” had about the price of eggs. And then you explained the bird flu to us. And then somehow we ended up learning about an ecological disaster that’s unfolding all around us, and potentially the source of the next human pandemic. That is pretty scary.

It is scary, and it’s easy to get overwhelmed by it. And I feel like I should take a step back and say none of this is inevitable. None of this is necessarily happening tomorrow. But this is why scientists are concerned and why they think it’s really important to keep a very close eye on what’s happening both on farms and off farms, as this virus spreads through all sorts of animal populations.

One thing that comes up again and again and again in my interviews with people who have been studying bird flu for decades, is how this virus never stops surprising them. And sometimes those are bad surprises, like these elephant seal die-offs, the incursions into dairy cattle. But there are some encouraging signs that have emerged recently. We’re starting to see some early evidence that some of the bird populations that survived early brushes with this virus might be developing some immunity. So that’s something that maybe could help slow the spread of this virus in animal populations.

We just don’t entirely know how this is going to play out. Flu is a very difficult, wily foe. And so that’s one reason scientists are trying to keep such a close, attentive eye on what’s happening.

Emily, thank you.

Thanks for having me.

Here’s what else you should know today.

On this vote, the yeas are 366 and the nays are 58. The bill is passed.

On Saturday, in four back-to-back votes, the House voted resoundingly to approve a long-stalled package of aid to Ukraine, Israel and other American allies, delivering a major victory to President Biden, who made aid to Ukraine one of his top priorities.

On this vote, the yeas are 385, and the no’s are 34 with one answering present. The bill is passed without objection.

The House passed the component parts of the $95 billion package, which included a bill that could result in a nationwide ban of TikTok.

On this vote, the yeas are 311 and the nays are 112. The bill is passed.

Oh, one voting present. I missed it, but thank you.

In a remarkable breach of custom, Democrats stepped in to supply the crucial votes to push the legislation past hard-line Republican opposition and bring it to the floor.

The House will be in order.

The Senate is expected to pass the legislation as early as Tuesday.

Today’s episode was produced by Rikki Novetsky, Nina Feldman, Eric Krupke, and Alex Stern. It was edited by Lisa Chow and Patricia Willens; contains original music by Marion Lozano, Dan Powell, Rowan Niemisto, and Sophia Lanman; and was engineered by Chris Wood. Our theme music is by Jim Brunberg and Ben Landsverk of Wonderly. Special thanks to Andrew Jacobs.

That’s it for “The Daily.” I’m Sabrina Tavernise. See you tomorrow.


Hosted by Sabrina Tavernise

Produced by Rikki Novetsky, Nina Feldman, Eric Krupke and Alex Stern

Edited by Lisa Chow and Patricia Willens

Original music by Marion Lozano, Dan Powell, Rowan Niemisto and Sophia Lanman

Engineered by Chris Wood


The outbreak of bird flu currently tearing through the nation’s poultry is the worst in U.S. history. Scientists say it is now spreading beyond farms into places and species it has never been before.

Emily Anthes, a science reporter for The Times, explains.

On today’s episode

Emily Anthes, a science reporter for The New York Times.


Background reading

Scientists have faulted the federal response to bird flu outbreaks on dairy farms.

Here’s what to know about the outbreak.


The Daily is made by Rachel Quester, Lynsea Garrison, Clare Toeniskoetter, Paige Cowett, Michael Simon Johnson, Brad Fisher, Chris Wood, Jessica Cheung, Stella Tan, Alexandra Leigh Young, Lisa Chow, Eric Krupke, Marc Georges, Luke Vander Ploeg, M.J. Davis Lin, Dan Powell, Sydney Harper, Mike Benoist, Liz O. Baylen, Asthaa Chaturvedi, Rachelle Bonja, Diana Nguyen, Marion Lozano, Corey Schreppel, Rob Szypko, Elisheba Ittoop, Mooj Zadie, Patricia Willens, Rowan Niemisto, Jody Becker, Rikki Novetsky, John Ketchum, Nina Feldman, Will Reid, Carlos Prieto, Ben Calhoun, Susan Lee, Lexie Diao, Mary Wilson, Alex Stern, Dan Farrell, Sophia Lanman, Shannon Lin, Diane Wong, Devon Taylor, Alyssa Moxley, Summer Thomad, Olivia Natt, Daniel Ramirez and Brendan Klinkenberg.

Our theme music is by Jim Brunberg and Ben Landsverk of Wonderly. Special thanks to Sam Dolnick, Paula Szuchman, Lisa Tobin, Larissa Anderson, Julia Simon, Sofia Milan, Mahima Chablani, Elizabeth Davis-Moorer, Jeffrey Miranda, Renan Borelli, Maddy Masiello, Isabella Anderson and Nina Lassam.

SpeechBrain

A PyTorch Powered Speech Toolkit

Key Features

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains.

Speech Recognition

Speaker Recognition

Speech Enhancement

Speech Processing

SpeechBrain provides efficient, GPU-friendly pipelines for speech augmentation, acoustic feature extraction, and normalisation, all of which can be applied on the fly during your experiment.
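To make the idea of on-the-fly processing concrete, here is a minimal standard-library sketch. It does not use SpeechBrain's own API: the function names `add_noise`, `mean_var_normalise`, and `on_the_fly` are hypothetical and chosen only for illustration. It assumes each utterance is a plain list of float samples, adds white noise at a target signal-to-noise ratio, and then applies mean and variance normalisation as each item is requested, rather than preprocessing the whole dataset up front.

```python
import math
import random


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise at approximately the requested SNR (in dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def mean_var_normalise(values, eps=1e-8):
    """Shift the values to zero mean and scale them to (nearly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) + eps  # eps guards against division by zero
    return [(v - mean) / std for v in values]


def on_the_fly(batch, snr_db=10.0, seed=0):
    """Lazily augment and normalise each utterance only when it is requested."""
    rng = random.Random(seed)
    for utterance in batch:
        yield mean_var_normalise(add_noise(utterance, snr_db, rng))


# Usage: a synthetic sine wave stands in for a real recording.
sine = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
processed = next(on_the_fly([sine]))
```

Because `on_the_fly` is a generator, every pass over the data can draw fresh noise, which is the usual motivation for augmenting during training instead of storing augmented copies on disk.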

Multi Microphone Processing

Research & Development

HuggingFace!

Why SpeechBrain?

Easy to install
Easy to use
Easy to customize

Adapts to your needs.

SpeechBrain thanks its generous sponsors. Sponsoring allows us to expand the team and further extend the functionalities of the toolkit.

Collaborators

Brain-computer interfaces for handwriting, speech and beyond

Abstract: Locked-in syndrome, caused by brainstem stroke or ALS, can leave an individual “trapped” inside of their own body and unable to move or talk. Brain-computer interfaces (BCIs) that detect the desire to move directly from the brain can restore communication via thought alone. My work has developed handwriting and speech BCIs that set new performance records, showing for the first time that the intention to talk or write can be neurally decoded at rapid speeds in people who have not done so for years. This unprecedented access to neural activity in the human brain has also led to new fundamental neuroscience insights – for example, that the whole body is represented in each part of the motor cortex, contradicting the traditional “homunculus” model. My future work will seek to expand the scope of what BCIs can do to disorders such as Broca’s aphasia (loss of speech/language production due to stroke), paving the way for BCIs to be a routine approach to solving otherwise intractable neurological disorders.

Was Biden's uncle eaten by cannibals near New Guinea in World War II? Here's what the president said.

WASHINGTON ― Was President Joe Biden 's uncle eaten by cannibals? That appears to be what he suggested − twice − this week when he said the remains of his uncle, a military veteran who died during World War II in a plane crash off the New Guinea coast, were not recovered.

Biden's telling differed from an account published by the Defense POW/MIA Accounting Agency , which says Biden's uncle, Ambrose Finnegan, and two other men "failed to emerge from the sinking wreck and were lost in the crash."

Biden discussed the 1944 death of Finnegan, a lieutenant in the U.S. Army Air Forces, after visiting a war memorial where Finnegan is honored in his hometown of Scranton, Pa. on Wednesday morning. He recounted the same story during remarks at the United Steelworkers union's headquarters in Pittsburgh later in the afternoon.

Biden's Uncle Bosie, 'a hell of an athlete'

"And my uncle − they called him Ambrose. Instead of 'Brosie,' they called him 'Bosie,'" Biden said. "My Uncle Bosie was a hell of an athlete, they tell me, when he was a kid. And he became an Army Air Corps, before the Air Force came along. He flew those single-engine planes as reconnaissance over war zones."


"And he got shot down in New Guinea, and they never found the body because there used to be − there were a lot of cannibals − for real − in that part of New Guinea," Biden continued.

The account from the DPAA says a crew of three men and one passenger, Finnegan, left Momote Airfield, Los Negros Island, on May 14, 1944, aboard an A-20 Havoc en route to Nadzab Airfield, New Guinea.

"For unknown reasons, this plane was forced to ditch in the ocean off the north coast of New Guinea. Both engines failed at low altitude, and the aircraft's nose hit the water hard," the account says.

In addition to the three men lost in the crash, one crew member survived and was saved by a passing barge, the DPAA report says, adding that, "An aerial search the next day found no trace of the missing aircraft or the lost crew members."

Finnegan "has not been associated with any remains recovered from the area after the war and is still unaccounted-for," according to the DPAA.

Biden brought up the circumstances of Finnegan's death while discussing how former President Donald Trump, the presumptive Republican frontrunner, reportedly disparaged American soldiers killed in combat as "suckers" and "losers" while president, according to military officials who worked for Trump. Trump has denied the allegations.

"I'm not making that up. His staff who was with him acknowledge it today," Biden said in Pittsburgh. "'Suckers' and 'losers.' That man doesn’t deserve to have been the commander-in-chief for my son, my uncle."

Biden's oldest son, Beau Biden, was an Iraq War veteran who died of brain cancer in 2015.

A politician's history of embellishment

Biden has a long history of embellishing stories on a number of subjects − whether it's being arrested during civil rights protests, which the New York Times reported there's no evidence of, the scale of a past kitchen fire at his Delaware home, or an oft-told exchange he had with an Amtrak conductor, who CNN reported was dead when Biden said the conversation occurred.

White House press secretary Karine Jean-Pierre did not defend the accuracy of Biden's cannibalism account when asked Thursday whether Biden embellished the story.

"The president highlighted his uncle's story as he made the case for honoring our sacred commitment to equip those we send to war and to take care of them and their families when they come home," she said. "And as he reiterated, the last thing American veterans are are 'suckers' or 'losers.'"

Reach Joey Garrison on X @joeygarrison.

Post-pandemic, US cardiovascular death rate continues upward trajectory

Medicine.net March 23, 2024


Researchers illustrate how caregiver speech shapes the infant brain

by Stephen Fontenot, University of Texas at Dallas


Glass brain image with a priori tracts of interest. Credit: Developmental Cognitive Neuroscience (2023). DOI: 10.1016/j.dcn.2023.101240

A team led by a University of Texas at Dallas neurodevelopment researcher has uncovered some of the most conclusive evidence yet that parents who talk more to their infants improve their babies' brain development.

The researchers used MRI and audio recordings to demonstrate that caregiver speech is associated with infant brain development in ways that improve long-term language progress. Dr. Meghan Swanson, assistant professor of psychology in the School of Behavioral and Brain Sciences, is corresponding author of the study, which was published online April 11 and in the June print edition of Developmental Cognitive Neuroscience .

"This paper is a step toward understanding why children who hear more words go on to have better language skills and what process facilitates that mechanism," Swanson said. "Ours is one of two new papers that are the first to show links between caregiver speech and how the brain's white matter develops."

White matter facilitates communication between the brain's various gray matter regions, where information processing takes place.

The research included 52 infants from the Infant Brain Imaging Study (IBIS), a National Institutes of Health Autism Center of Excellence project involving eight universities in the U.S. and Canada and clinical sites in Seattle, Philadelphia, St. Louis, Minneapolis, and Chapel Hill, North Carolina. Home language recordings were collected when children were 9 months old and again six months later, and MRIs were performed at 3 and 6 months of age and again at ages 1 and 2.

"This timing of home recordings was chosen because it straddles the emergence of words," Swanson said. "We wanted to capture both this prelinguistic, babbling time frame, as well as a point after or near the emergence of talking."

It's long been known that an infant's home environment—especially the quality of caregiver speech—directly influences language acquisition, but the mechanisms behind this are unclear. Swanson's team imaged several areas of the brain's white matter, focusing on developing neurological pathways.

"The arcuate fasciculus is the fiber tract that everyone in neurobiology courses learns is essential to producing and understanding language, but that finding is based on adult brains," Swanson said. "In these children, we looked at other potentially meaningful fiber tracts as well, including the uncinate fasciculus, which has been linked to learning and memory."

The researchers used the images to measure fractional anisotropy (FA). This metric, which reflects how free or restricted water movement is in the brain, is used as a proxy for the progress of white matter development.

"As a fiber tract matures, water movement becomes more restricted, and the brain's structure becomes more coherent," Swanson said. "Because babies aren't born with highly specialized brains, one might expect that networks that support a given cognitive skill start out more diffuse and then become more specialized."

Swanson's team found that infants who heard more words had lower FA values, indicating that the structure of their white matter was slower to develop. The children went on to have better linguistic performance when they began to talk.

The study's results align with other recent research showing that slower maturation of white matter confers a cognitive advantage.

"As a brain matures, it becomes less plastic—networks get set in place. But from a neurobiological standpoint, infancy is unlike any other time. An infant brain seems to rely on a prolonged period of plasticity to learn certain skills," Swanson said. "The results show a clear, striking negative association between FA and child vocalization."

Sharnya Govindaraj, co-first author of the paper, a cognition and neuroscience doctoral student and a member of Swanson's Baby Brain Lab, said at first she was surprised by the results.

"We initially didn't know how to interpret these negative associations that seemed very counterintuitive. The whole concept of neuroplasticity and absorbing new knowledge had to fall into place," she said. "Which ability we're looking at also matters a great deal, because something like vision matures much earlier than language."

As the parent of a toddler in a bilingual household, Swanson was curious about how this relationship functions for infants exposed to more than one language.

"Raising a bilingual child, it is remarkable how she is not confused by languages, and she knows who she can use which language with," Swanson said.

Swanson said she also has gained a deeper level of appreciation and gratitude for what she, as a researcher, asks parents in her studies to do.

"When participants sign up, I'm asking them to commit to a year and a half," she said. "Because of the commitment of all the parents in prior studies, I and others have the knowledge that allows us to communicate with our children in a way that supports their development."

Swanson said the take-home message is that parents have the power to help their children develop.

"This work highlights parents as change agents in their children's lives, with the potential to have enormous protective effects," Swanson said. "I hope our work empowers parents with the knowledge and skills to support their children as best they can."

More information: Katiana A. Estrada et al, Language exposure during infancy is negatively associated with white matter microstructure in the arcuate fasciculus, Developmental Cognitive Neuroscience (2023). DOI: 10.1016/j.dcn.2023.101240

Provided by University of Texas at Dallas

COMMENTS

SpeechBrain: Open-Source Conversational AI for Everyone
SpeechBrain supports state-of-the-art technologies for speech recognition, enhancement, separation, text-to-speech, speaker recognition, speech-to-speech translation, spoken language understanding, and beyond. ... class ASR_Brain(sb.Brain): def compute_forward(self, batch, stage): # Compute features (mfcc, fbanks, etc.) on the fly features ...
GitHub
With the rise of deep learning, once-distant domains like speech processing and NLP are now very close. A well-designed neural network and large datasets are all you need. We think it is now time for a holistic toolkit that, mimicking the human brain, jointly supports diverse technologies for complex Conversational AI systems. This spans speech recognition, speaker recognition, speech ...
speechbrain (SpeechBrain)
SpeechBrain is an open-source and all-in-one conversational AI toolkit based on PyTorch. We released to the community models for Speech Recognition, Text-to-Speech, Speaker Recognition, Speech Enhancement, Speech Separation, Spoken Language Understanding, Language Identification, Emotion Recognition, Voice Activity Detection, Sound Classification, Grapheme-to-Phoneme, and many others.
speechbrain · PyPI
SpeechBrain is an open-source PyTorch toolkit that accelerates Conversational AI development, i.e., the technology behind speech assistants, chatbots, and large language models. ... The Brain class serves as a fully customizable tool for managing training and evaluation loops over data. It simplifies training loops while providing the ...
SpeechBrain: A PyTorch Speech Toolkit
SpeechBrain supports state-of-the-art methods for end-to-end speech recognition, including models based on CTC, CTC+attention, transducers, transformers, and neural language models relying on recurrent neural networks and transformers. Speaker Recognition. Speaker recognition is already deployed in a wide variety of realistic applications. ...
SpeechBrain: A PyTorch Speech Toolkit
Text-to-Speech (TTS, also known as Speech Synthesis) allows users to generate speech signals from an input text. SpeechBrain supports popular models for TTS (e.g., Tacotron2) and Vocoders (e.g., HiFIGAN). ... class ASR_Brain(sb.Brain): def compute_forward(self, batch, stage): # Compute features (mfcc, fbanks, etc.) on the fly features = self ...
SpeechBrain · GitHub
speechbrain.github.io Public The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
GitHub
The SpeechBrain Toolkit. SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone ...
About SpeechBrain
1. Researchers: Propose and implement new functionalities through pull requests. Each new pull request undergoes review by a core team member to maintain high-quality standards. 2. Sponsors: Contribute financially or with human resources by reaching out to us via email. Also, check out our 2024 call for sponsors.
[2106.04624] SpeechBrain: A General-Purpose Speech Toolkit
SpeechBrain: A General-Purpose Speech Toolkit. SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the research and development of neural speech processing technologies by being simple, flexible, user-friendly, and well-documented. This paper describes the core architecture designed to support several tasks ...
User guide
User guide. SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. This documentation is intended to give SpeechBrain users all the API information necessary to develop their projects. For tutorials, please refer to the official Github or the official Website.
PDF SpeechBrain: A General-Purpose Speech Toolkit
tasks at once, for example, recognize speech, understand its content, language, emotions, and speakers. Our toolkit is not only intended for speech researchers, but also for the broader machine learning community, enabling users to easily integrate their models into different speech pipelines and compare them with state-of-the-art (SotA ...
SpeechBrain
Welcome to the SpeechBrain YouTube channel, your hub for open-source conversational AI insights. Join us for talks, presentations, and tutorials. Explore more at https://speechbrain.github.io/
Quick installation
Quick installation. SpeechBrain is constantly evolving. New features, tutorials, and documentation will appear over time. SpeechBrain can be installed via PyPI to rapidly use the standard library. Moreover, a local installation can be used to run experiments and modify/customize the toolkit and its recipes. SpeechBrain supports both CPU and GPU ...
Introducing SpeechBrain: A general-purpose PyTorch speech ...
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to make the research and development of neural speech processing technologies easier by being simple, flexible, user-friendly, and well-documented. We designed it to natively support multiple speech tasks of common interest, including:
Which Side of the Brain Controls Speech?
Medically reviewed by Heidi Moawad, M.D. — By Jared C. Pistoia, ND on February 22, 2023. The left side of your brain controls voice and articulation. The Broca's area, in the frontal part of the ...
SpeechBrain Basics
SpeechBrain is an open-source all-in-one speech toolkit based on PyTorch. It is designed to make the research and development of speech technology easier. Alongside our documentation, this tutorial provides all the basic elements needed to start using SpeechBrain for your projects. Open in Google Colab.
speechbrain/metricgan-plus-voicebank · Hugging Face
@misc{speechbrain, title={{SpeechBrain}: A General-Purpose Speech Toolkit}, author={Mirco Ravanelli and Titouan Parcollet and Peter Plantinga and Aku Rouhe and Samuele Cornell and Loren Lugosch and Cem Subakan and Nauman Dawalatabad and Abdelwahab Heba and Jianyuan Zhong and Ju-Chieh Chou and Sung-Lin Yeh and Szu-Wei Fu and Chien-Feng Liao and ...
Speech planning: How our brains prepare to speak
Before speech occurs, the brain must plan and coordinate the specific movements of the mouth, tongue, vocal cords, and lungs that produce speech. This involves the motor cortex, which controls ...
GitHub
About. The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Unraveling Predictive Mechanism in Speech Perception and ...
How neural networks coordinate to support speech perception and speech production represents a forefront research topic in both contemporary neuroscience and artificial intelligence. Despite the successful incorporation of hierarchical and predictive attributes from biological neural networks (BNNs) into artificial counterparts, substantial disparities persist, particularly in terms of real ...
Berkeley Voices: One brain, two languages
Anne Brice: With advanced brain imaging technologies, researchers have learned a lot about where and how language is processed in the brain — and they continue to learn more. Since 2020, Davidson has been part of a research team at UC San Francisco that's working with bilinguals to map where exactly in the brain particular sound features of ...
Tutorials
Tutorials. A good way to familiarize yourself with SpeechBrain is to take a look at the Colab tutorials that we made available. More tutorials will be made available as the project progresses. The full list of tutorials can be found on the official website. All the tutorials are developed on the Google Colab platform.
Aphasia and Stroke
Aphasia is a language disorder that affects your ability to communicate. It's most often caused by strokes in the left side of the brain, which controls speech and language. People with aphasia may struggle with communicating in daily activities at home, socially or at work. They may also feel isolated. Aphasia doesn't affect intelligence.
The Evolving Danger of the New Bird Flu
The Evolving Danger of the New Bird Flu. An unusual outbreak of the disease has spread to dairy herds in multiple U.S. states. April 22, 2024, 6:00 a.m. ET. Share full article. Hosted by Sabrina ...
SpeechBrain: A PyTorch Speech Toolkit
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains. ... class ASR_Brain(sb.Brain): def compute_forward(self, batch, stage): # Compute features (mfcc, fbanks, etc.) on the fly feats = self ...
Brain-computer interfaces for handwriting, speech and beyond
Brain-computer interfaces (BCIs) that detect the desire to move directly from the brain can restore communication via thought alone. My work has developed handwriting and speech BCIs that set new performance records, showing for the first time that the intention to talk or write can be neurally decoded at rapid speeds in people who have not ...
SpeechBrain Advanced
This tutorial will show you how to load large datasets from the shared file system and use them for training a neural network with SpeechBrain. In particular, we describe a solution based on the WebDataset library, that is easy to integrate within the SpeechBrain toolkit. Open in Google Colab. SpeechBrain Advanced.
Was Joe Biden's uncle eaten by cannibals during World War II?
Biden's oldest son, Beau Biden, was an Iraq War veteran who died of brain cancer in 2015. A politician's history of embellishment.
Researchers illustrate how caregiver speech shapes the infant brain
A team led by a University of Texas at Dallas neurodevelopment researcher has uncovered some of the most conclusive evidence yet that parents who talk more to their infants improve their babies' brain development. The researchers used MRI and audio recordings to demonstrate that caregiver speech is associated with infant brain development in ...

speechbrain 1.0.0

Verified details

Unverified details

Classifiers

Project description

🗣️💬 What SpeechBrain Offers

📚 Training Recipes

Pretrained Models and Inference

Documentation

🎯 Use Cases

🚀 Quick Start

🛠️ Installation

Install from GitHub

✔️ Test Installation

🏃‍♂️ Running an Experiment

📘 Learning SpeechBrain

🔧 Supported Technologies

🎙️ Speech/Audio Processing

📊 Performance

🔮 Future Plans

🤝 Contributing

📖 Citing SpeechBrain

Project details

Download files

Source Distribution

Built Distribution

Hashes for speechbrain-1.0.0.tar.gz

SpeechBrain

Key Features

Speech Recognition

Speaker Recognition

Speech Enhancement

Speech Processing

Multi Microphone Processing

Text-to-Speech

Research & Development

HuggingFace!

Why SpeechBrain?

Adapts to your needs.

Previous Sponsors

Collaborators

Introducing SpeechBrain: A general-purpose PyTorch speech processing toolkit

What is SpeechBrain?

Usage Example

Inference with a pre-trained model

Training a model

Future Plans

Acknowledgments

Contributors

Similar articles

speechbrain / metricgan-plus-voicebank

Install SpeechBrain

Pretrained Usage

Inference on GPU

Limitations

Referencing MetricGAN+

About SpeechBrain

Citing SpeechBrain

speechbrain/speechbrain.github.io

Unraveling Predictive Mechanism in Speech Perception and Production: Insights from EEG Analyses of Brain Network Dynamics

Author information

Corresponding author

Ethics declarations

Additional information

Document code

The Evolving Danger of the New Bird Flu

On today’s episode

Background reading
