Check the CHANGELOG file for a global overview of the latest modifications!
├── custom_architectures
├── custom_layers
├── custom_train_objects
│   ├── generators
│   │   ├── file_cache_generator.py : abstract generator caching processed data
│   │   └── ge2e_generator.py       : generator dedicated to formatting GE2E inputs
│   ├── losses
│   │   └── ge2e_loss.py            : Keras 3 implementation of the GE2E loss
│   └── metrics
│       └── ge2e_metric.py          : custom metric associated with the GE2E loss
├── loggers
├── models
│   └── encoder
│       ├── audio_encoder.py        : audio encoder class (audio to audio comparison with GE2E loss)
│       ├── base_encoder.py         : abstract Encoder class (trained with the GE2E loss)
│       └── text_encoder.py         : text encoder that uses pretrained embedding models
├── pretrained_models
├── unitests
├── utils
├── speaker_verification.ipynb
└── information_retrieval.ipynb
Check the main project for more information about the unextended modules / structure / main classes.
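For readers unfamiliar with the GE2E loss named in the tree above, here is a minimal, dependency-free sketch of its softmax variant as described in the GE2E paper. It is not the code from `ge2e_loss.py`: the function name `ge2e_softmax_loss`, the default scale `w` and bias `b`, and the choice to average over utterances (instead of summing) are assumptions made for illustration, and plain Python lists stand in for Keras tensors.

```python
import math

def _cosine(a, b):
    """Cosine similarity between two vectors (small epsilon avoids division by zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)

def _mean(vectors):
    """Element-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """
    embeddings[j][i] is the embedding of utterance i of speaker j.
    Each speaker needs at least 2 utterances.
    Returns the loss averaged over all utterances (hypothetical helper).
    """
    centroids = [_mean(utterances) for utterances in embeddings]
    total, count = 0.0, 0
    for j, utterances in enumerate(embeddings):
        for i, emb in enumerate(utterances):
            # The centroid of the true speaker excludes the utterance itself
            own_centroid = _mean(utterances[:i] + utterances[i + 1:])
            sims = [
                w * _cosine(emb, own_centroid if k == j else centroid) + b
                for k, centroid in enumerate(centroids)
            ]
            # Softmax cross-entropy: -S[j] + log(sum_k exp(S[k]))
            total += math.log(sum(math.exp(s) for s in sims)) - sims[j]
            count += 1
    return total / count
```

With well-separated speakers the value is close to zero, and it grows when utterances sit closer to another speaker's centroid than to their own.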
Important Note : this project is the Keras 3 extension of the siamese networks project. Not all features are available yet. Once the conversion is complete, the siamese networks project will be removed in favor of this one.
Input types | Dataset | Architecture | Embedding dim | Trainer | Weights |
---|---|---|---|---|---|
mel-spectrogram | VoxForge, CommonVoice | AudioEncoder (CNN 1D + LSTM) | 256 | me | Google Drive |
Models must be unzipped in the pretrained_models/ directory!
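The unzip step can be scripted with the Python standard library. This is a minimal sketch: the helper name `unzip_model` and the archive name used in the usage note are hypothetical, and it assumes the downloaded archive contains one top-level directory per model.

```python
import zipfile
from pathlib import Path

def unzip_model(archive_path, target_dir="pretrained_models"):
    """Extract a model archive into `target_dir` and return its top-level entries."""
    archive_path = Path(archive_path)
    target_dir = Path(target_dir)
    # Create the target directory if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        # Top-level names in the archive (assumed to be the model directories)
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return [target_dir / name for name in top_level]
```

For example, `unzip_model("my_encoder.zip")` (a placeholder file name) would extract the archive under `pretrained_models/` and return the extracted top-level paths.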
Check this installation guide for step-by-step instructions!
- Make the TO-DO list
- Comment the code
- Optimize KNN in pure Keras 3
- Implement the clustering procedure
- Implement the similarity matrix evaluation procedure
- Implement the clustering evaluation procedure
- Convert the siamese_networks project :
  - Implement the BaseEncoder class
  - Implement the BaseSiamese class
  - Implement the BaseComparator class
  - Implement the SiameseGenerator class
- Update the README to provide more information about evaluation of encoders
- Implement the
- Implement text embedding models
- Implement a vectors database for information retrieval
- Implement a colbert-vectors database for fine-grained search
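To illustrate what the KNN and vector-database items above refer to, here is a minimal sketch of nearest-neighbour search over dense embeddings using cosine similarity. It is written in plain Python for readability and is not the project's implementation; the names `cosine_similarity` and `top_k` and the `(identifier, vector)` database layout are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors; returns 0.0 for zero-norm inputs."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, database, k=3):
    """
    Return the `k` entries of `database` most similar to `query`.
    `database` is a list of (identifier, vector) pairs (hypothetical layout).
    """
    scored = [(cosine_similarity(query, vector), ident) for ident, vector in database]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(ident, score) for score, ident in scored[:k]]
```

This brute-force scan compares the query against every stored vector, which is fine for small collections; a batched tensor implementation would be the natural choice for larger ones.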
Contacts :
- Mail : yui-mhcp@tutanota.com
- Discord : yui0732
The goal of these projects is to support and advance education and research in Deep Learning technology. To facilitate this, all associated code is made available under the GNU Affero General Public License (AGPL) v3, supplemented by a clause that prohibits commercial use (cf the LICENCE file).
These projects are released as "free software", allowing you to freely use, modify, deploy, and share the software, provided you adhere to the terms of the license. While the software is freely available, it is not public domain and retains copyright protection. The license conditions are designed to ensure that every user can utilize and modify any version of the code for their own educational and research projects.
If you wish to use this project in a proprietary commercial endeavor, you must obtain a separate license. For further details on this process, please contact me directly.
For my protection, it is important to note that all projects are available on an "As Is" basis, without any warranties or conditions of any kind, either explicit or implied. However, do not hesitate to report issues on the project's repository, or open a Pull Request to fix them.
If you find this project useful in your work, please add this citation to give it more visibility!
@misc{yui-mhcp,
author = {yui},
title = {A Deep Learning projects centralization},
year = {2021},
publisher = {GitHub},
howpublished = {\url{https://github.com/yui-mhcp}}
}
Tutorials :
- Medium tutorial for speaker verification with siamese networks.
- Google GE2E Loss tutorial : amazing Google tutorial explaining the benefits of the GE2E loss compared to the Siamese approach (which is really similar to their Tuple End-to-End (TE2E) loss principle)
- LLama-index tutorial on information retrieval with dense vectors
Github project :
- voicemap project : nice project for speaker verification.
- OpenAI's CLIP : the official CLIP implementation in PyTorch.
- LLama-index : well-known library for information retrieval