GitHub - DmitryRyumin/ICASSP-2023-24-Papers: ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!


ICASSP 2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2024 conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. ⭐ the repository to support the advancement of audio and signal processing!

Tip

The online version of the ICASSP 2024 Conference Technical Program lists all accepted full papers along with their presentation mode and time.

Other collections of the best AI conferences

Important

The conference table below is kept up to date.

Conference	2023	2024
Computer Vision (CV)
CVPR
ICCV
ECCV
WACV	➖
FG	➖
Speech/Signal Processing (SP/SigProc)
ICASSP
INTERSPEECH
ISMIR		➖
Natural Language Processing (NLP)
EMNLP
Machine Learning (ML)
AAAI	➖
ICLR	➖
ICML	➖
NeurIPS	➖

Contributors

Note

Contributions to improve the completeness of this list are greatly appreciated. If you come across any overlooked papers, please feel free to create pull requests, open issues or contact me via email. Your participation is crucial to making this repository even better.

Papers

Section	Papers
Main
Audio-Visual Speech Processing
Vision and Language
Acoustic Signal Processing
Deep Learning Techniques
Speech Enhancement and Separation - Diffusion and other Probabilistic Models
ASPS Lecture
Distributed and Federated Learning
Transfer Learning
Voice Conversion
Graph Neural Networks
Language Resources, Metrics and Systems
Watermarking and Data Hiding
Signal and Information Processing over Graphs
Integrated Sensing and Communications
Audio Events Detection and Classification; Music Information Retrieval
Language Understanding and Computational Semantics - NLP Tasks
Physiological and Wearable Signal Processing
Speech Enhancement; Music Information Retrieval
Multimodal Medical Image Fusion and Analysis
Sparse/Low-Dimensional Signal Processing
Robust and Sustainable Machine Learning
Machine Learning for Image and Video Processing
Deep Learning Generalization
Distributed Processing and Federated Learning
Biological Image Analysis
Learning from Multimodal Data
Biometrics
Detection and Classification
Multimedia Coding
Anonymisation, Data Privacy and Hiding
Quality Assessment and Anomaly Detection
Signal Filtering, Reconstruction, Restoration and Enhancement
Speech Emotion Recognition and Analysis
Deep Generative Models
Context and LLM Speech Recognition
Music Information Retrieval
Multimodal Processing: Vision + Language
Environmental Sound Synthesis and Generation
Biomedical and Biological Image Processing
DoA Estimation
Tracking
Machine Learning for Communications
Image and Video Processing for Watermarking and Security
Self-Supervised Learning for Speech Processing
Deep Learning for Image and Video Processing
Image, Video, and 3D Content Generation
Classification of Acoustic Scenes and Events
Reinforcement Learning
Subspace and Manifold Learning
Active Noise Control and Echo Cancellation; Source Separation
Machine Learning, Detection and Classification
Machine Learning for Audio, Speech and Music Processing
Multimedia Generation and Synthesis
Medical Image Detection and Segmentation
Multimedia Forensics and Cybersecurity
Estimation Theory and Methods
Emerging Methods for Biomedical Image and Signal Processing
Text to Speech Generation
Audio Classification, Detection and Localization
Self-Supervised and Semi-Supervised Learning
Multichannel/Multimodal Speech Recognition
Speaker Verification
Speaker Diarization
Adversarial Machine Learning
Machine Learning Methods for Language
SPED: Signal Processing Education
Multimedia Quality of Experience
Domain-Enriched Learning for Medical Image Processing
Speech Enhancement and Separation
Image Denoising
ASPS Poster
ASR - New Algorithms and Approaches
Data Mining and Big Data
Language Understanding and Computational Semantics - Machine Learning
Explainable and Interpretable Machine Learning
Neuroimaging and Brain/Human-Computer Interfaces
Localization, DOA Estimation, Spatial Audio Recording and Reproduction
Perception and Processing for Autonomous Systems and Applications
Computational Imaging
Audio and Speech Quality and Intelligibility Measures; Music Analysis
Medical Image Formation, Reconstruction and Restoration
Audio and Speech Source Separation
Text-based Customization for Speech-to-Text
Deep Learning Models
Next-Gen Communication Systems
Image Restoration
Robustness and Trustworthy Machine Learning
Signal Processing over Networks
3D Understanding
Compressed Sensing and Machine Learning for Multi-Sensor Systems
LIMMITS: Multi-Speaker, Multi-Lingual Indic TTS with Voice Cloning
Natural Language Processing for Speech-to-Text
Resource Constrained Acoustic and Language Modeling
Dereverberation and RIR Estimation; Speech Enhancement and Restoration
Image/Video Super-Resolution
Matrix Factorization and Source Separation
Beamforming for Audio and Speech; Music Signal Analysis, Processing and Synthesis
Summarization, Retrieval and Language Learning
Sequential Learning and Sequential Decision Methods
MIMO and Massive MIMO Communication Systems
Multimodal Emotion/Sentiment Analysis
Human Understanding
Image and Video Synthesis
MIMO and High-Frequency Communications
Image and Video Super-Resolution
Spatial Audio Recording and Reproduction
Audio Signal Restoration and Speech Enhancement
Discourse and Dialog
Bayesian Signal Processing
Pattern Recognition and Classification
Key Word Spotting
Speech Analysis - Pitch, Spectrum and Voice Disorders
Grand Challenge on Hyperspectral Skin Vision
Robust Speech Recognition and Adaptation
Speech Analysis and Language Disorder Analysis
Aspects in Image/Video Processing and Analysis
DoA Estimation and Source Localization
Multimodal Processing of Language
Source separation; Music analysis
Machine Learning for Time Series Analysis
Multimedia Search and Retrieval
Anomaly Detection; Sound Event Detection and Localization
Acoustic Array and Signal Processing
Music Signal Analysis and Processing
Language Understanding and Computational Semantics - Language Models
Deep Learning Theory
Anti-Spoofing	Will soon be added
Pose, Gesture, and Action in Multimedia
Sampling Theory, Compressed and Non-Uniform Sampling
MIMO and Massive MIMO Systems
Multimodal and Emerging Medical Signal Analysis
The RF Signal Separation Challenge
Signal Processing for Communications
Audio and Speech Modeling, Coding and Transmission; Spatial Audio Recording and Reproduction
Voice Conversion: Singing, Accent and Emotion
Other Machine Learning Applications
Speaker Recognition and Anonymization
Feature Extraction Selection and Learning
Music Information Retrieval; Quality and Intelligibility Measures
Learning Theory and Performance Bound
Human-Centric Multimedia
Multilingual Speech Recognition and Identification
Image Recognition and Detection
Signal Processing over Graphs and Networks
End-to-End Modeling for Automatic Speech Recognition
Segmentation, Tagging, and Parsing of Language
Detection
Audio-Language Processing and Audio Captioning
Action Recognition
Image, Video and Other Applications
Multimodal Information Based Speech Processing (MISP)
Next-Gen Communications and PHY Security
Network and System Security
Target Source Extraction; Active Noise Control, Echo Reduction and Feedback Reduction
Machine Translation for Spoken and Written Language
Sound Events Detection, Description and Generation
Applied Cryptography
Machine/Deep Learning Methodologies for Multimedia
Speech Separation and Extraction
Signal Processing and Machine Learning for Communications
Audio Coding
Active Noise Control and Echo Cancellation
Bayesian Machine Learning
Advancing the Frontiers of Deep Learning for Low-Dose 3D Cone-Beam CT Reconstruction
Bioacoustics and Medical Acoustics; Audio Security
Acoustic Modeling for Automatic Speech Recognition
Multimodal Processing of Speech
IFS General
3D Image and Video Processing and Analysis
Deep Learning Training Methods
Key Word Spotting and Acoustic Event Detection
Coding, Information Theory, and Applications of Signal Processing for Communications
Speech Analysis
Music Separation; Audio for Multimedia and Audio Processing Systems
Machine Learning for Communications and Wireless Networks
Image and Video Coding/Compression
Bioinformatics and Biomedical Signal Processing
Audio-Visual Speech/Intent Recognition
Multimodal Clustering, Segmentation, and Summarization
Learning Theory and Methods
SP Cadenza Challenge: Music Demixing/Remixing for Hearing Aids
Radar Signal Processing
Biological and Medical Signal and Image Processing
Anti-Spoofing and Speaker Embedding
Speech Enhancement; Dereverberation and RIR Estimation
Segmentation
3D Generation
Multimedia Forensics
Speech Signal Improvement Challenge
Audio Deep Packet Loss Concealment Grand Challenge
Signal Processing Theory and Methods Journal Papers
Multi-Sensor and Multichannel Signal Processing
Array Processing and Beamforming
Sound Event Classification and Generation; Active Noise Control, Echo Reduction and Feedback Reduction
Deep Learning Fairness and Privacy
Sparsity and Low-Rank Models
Optimization Methods for Signal Processing
Multimodal Processing
Show and Tell Demos
Special Session
Model based Machine Learning for Wireless Communications and Sensing	Will soon be added
Exploiting Diversities in Advanced Array Systems: New Applications and Trends
Generative Semantic Communication: How Generative Models Enhance Semantic Communications
Quantum Machine Learning Algorithms and Applications on NISQ Devices
Robust Reconstruction Methods in Computational Imaging
Graphical Inference and Modeling in Dynamical Systems
Advancements in Integrated Sensing and Communication for Next-Generation Wireless Networks
Signal and Graph Processing for Autonomous Agents
Next-Generation Wi-Fi Sensing
Signal Processing Theory for Covert Communication and Cybersecurity
In-Context Learning Methods for Speech and Spoken Language Processing
Topological Signal Processing over Higher-Order Networks
Deepfakes and AI-Generated Content (AIGC) Detection and Forensics: Recent Advances
Recent Advances in AI-Powered Visual Computing and Multimodal Signal Processing for Metaverse Era
Algorithm-Hardware Co-Design of Neuromorphic Solutions for Signal Processing Applications
Automotive Radar Signal Processing for Autonomous Driving
Learning with Incomplete Medical Data
Signal Processing and Machine Learning for Collective Intelligence
Variational Inference and Approximate Bayesian Techniques
Efficient Modeling of Long Sequences with Applications to Speech and Audio
Decentralized Learning with Resource-Constrained Communication
Localization and Sensing based on Signals from Terrestrial and Non-Terrestrial Networks
Signal Processing and Machine Learning for Understanding Brain Dynamics
