vision-transformer

Here are 945 public repositories matching this topic...

open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark

Updated Aug 21, 2024
Python

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

python machine-learning ocr latex deep-learning image-processing pytorch dataset transformer vit image2text im2text im2latex im2markup math-ocr vision-transformer latex-ocr

Updated Dec 5, 2024
Python

NielsRogge / Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

transformers pytorch bert gpt-2 layoutlm vision-transformer

Updated Oct 21, 2024
Jupyter Notebook

FoundationVision / VAR

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official implementation of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

transformers generative-model image-generation auto-regressive-model gpt neurips gpt-2 diffusion-models autoregressive-models vision-transformer large-language-models generative-ai

Updated Dec 6, 2024
Python

adithya-s-k / omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

ocr parser-library web-crawler parse-server whisper-api ingestion-api vision-transformer omniparser

Updated Nov 3, 2024
Python

cmhungsteve / Awesome-Transformer-Attention

An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites

computer-vision deep-learning transformers transformer awesome-list vit papers attention-mechanism attention-mechanisms self-attention transformer-architecture transformer-models detr vision-transformer transformer-cv transformer-with-cv transformer-awesome visual-transformer

Updated Jul 30, 2024

JingyunLiang / SwinIR

SwinIR: Image Restoration Using Swin Transformer (official repository)

decompression transformer super-resolution image-denoising image-restoration restoration denoising image-super-resolution low-level-vision deblocking vision-transformer image-deblocking compression-artifact-reduction real-world-image-super-resolution lightweight-image-super-resolution image-sr

Updated May 14, 2024
Python

huawei-noah / Efficient-AI-Backbones

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

tensorflow pytorch transformer imagenet convolutional-neural-networks pretrained-models model-compression efficient-inference ghostnet vision-transformer

Updated Nov 30, 2024
Python

open-mmlab / mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark

deep-learning pytorch image-classification resnet pretrained-models clip mae mobilenet moco multimodal self-supervised-learning constrastive-learning beit vision-transformer swin-transformer masked-image-modeling convnext

Updated Nov 1, 2024
Python

google-research / scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

research computer-vision deep-learning transformers attention jax vision-transformer

Updated Dec 18, 2024
Python

towhee-io / towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

machine-learning computer-vision pipeline image-processing embeddings transformer video-processing feature-extraction convolutional-networks vit feature-vector image-retrieval unstructured-data embedding-vectors milvus vision-transformer towhee llm

Updated Oct 18, 2024
Python

InternLM / InternLM-XComposer

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

foundation gpt language-model multimodal multi-modality vision-transformer gpt-4 visual-language-learning llm chatgpt instruction-tuning large-language-model supervised-finetuning mllm vision-language-model large-vision-language-model

Updated Dec 18, 2024
Python

mit-han-lab / efficientvit

Efficient vision foundation models for high-resolution generation and perception.

imagenet segmentation high-resolution vision-transformer efficientvit segment-anything deep-compression-autoencoder efficient-diffusion-model

Updated Dec 9, 2024
Python

baaivision / EVA

EVA Series: Visual Representation Fantasies from BAAI

representation-learning vision-transformer foundation-models

Updated Aug 1, 2024
Python

hila-chefer / Transformer-Explainability

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

deep-learning vit bert perturbation attention-visualization bert-model explainability attention-matrix vision-transformer transformer-interpretability visualize-classifications cvpr2021

Updated Jan 24, 2024
Jupyter Notebook

alibaba / EasyCV

An all-in-one toolkit for computer vision

computer-vision transformers pytorch classification object-detection self-supervised-learning vision-transformer

Updated Jul 18, 2024
Python

microsoft / Cream

This is a collection of our NAS and Vision Transformer work.

efficiency nas knowledge-distillation rpe automl vision-transformer vit-compression

Updated Jul 25, 2024
Python

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Updated Dec 11, 2024
Python

MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

pytorch transformer action-recognition video-understanding mae video-analysis video-representation-learning self-supervised-learning masked-autoencoder vision-transformer video-transformer neurips-2022

Updated Dec 8, 2023
Python

ViTAE-Transformer / ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"

deep-learning pytorch pose-estimation mae distillation self-supervised-learning vision-transformer

Updated Jul 24, 2024
Python
