FrostNet: Towards Quantization-Aware Network Architecture Search
Updated May 3, 2024 · Python
A Python package for simulating low precision arithmetic in scientific computing and machine learning
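Packages like this typically simulate low precision by quantizing a float value onto a narrow fixed-point or float grid and immediately dequantizing it, so all arithmetic still runs in ordinary floats but carries the rounding error of the target format. A minimal NumPy sketch of signed fixed-point simulation (the function name and parameters are illustrative, not this package's API):

```python
import numpy as np

def simulate_fixed_point(x, word_bits=8, frac_bits=4):
    """Round x onto a signed fixed-point grid with `word_bits` total bits
    and `frac_bits` fractional bits, saturating at the representable range,
    then return the result as ordinary floats."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (word_bits - 1))        # e.g. -128 for 8-bit words
    qmax = 2 ** (word_bits - 1) - 1       # e.g. +127 for 8-bit words
    q = np.clip(np.round(x * scale), qmin, qmax)  # quantize + saturate
    return q / scale                               # dequantize back to float

x = np.array([0.1, 1.37, -3.999, 20.0])
print(simulate_fixed_point(x))  # values snapped to multiples of 1/16, clipped at ±8
```

With 4 fractional bits the grid step is 1/16, so 0.1 becomes 0.125 and the out-of-range 20.0 saturates to 127/16 = 7.9375.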
🎯 Fine-tune large language models and use them for text-related tasks. This repository provides a straightforward approach to fine-tuning models like Gemma, Llama 🦙, and Mistral 🌪️ for various NLP tasks. 🔧 It includes training 📚, fine-tuning 🛠️, and inference pipelines ⚙️. 🚀
Post-training quantization with TensorFlow and model compilation with TVM
ResNet18 model optimization for CIFAR-10 using Post-Training Quantization and Quantization-Aware Training (PTQ/QAT) to reduce model size and improve inference speed.
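As background on the PTQ half of that repo: post-training quantization maps an already-trained float tensor onto an integer grid using a scale and zero-point computed from the tensor's observed range, with no retraining; QAT instead inserts the same rounding into the training loop so the weights adapt to it. A hedged NumPy sketch of the standard asymmetric uint8 scheme, not the repository's own code:

```python
import numpy as np

def quantize_affine(w, num_bits=8):
    """Asymmetric post-training quantization: map floats in
    [w.min(), w.max()] onto the integer grid [0, 2^num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / qmax or 1.0          # guard against /0 for constant tensors
    zero_point = int(round(-lo / scale))     # integer that represents float 0.0
    q = np.clip(np.round(w / scale) + zero_point, 0, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

w = np.array([-1.0, -0.25, 0.0, 0.5, 1.0], dtype=np.float32)
q, s, z = quantize_affine(w)
w_hat = dequantize(q, s, z)  # per-element error is at most about scale/2
```

A real pipeline applies this per layer (often per output channel for weights) and calibrates activation ranges on sample inputs, but the scale/zero-point arithmetic is the same.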