Open Source AI Tools — Comprehensive List

Core Frameworks & Libraries

TensorFlow – End-to-end ML platform by Google.
Link: https://tensorflow.org/
PyTorch – Dynamic graph deep-learning library favored for research.
Link: https://pytorch.org/
Keras – High-level neural network API; Keras 3 runs on TensorFlow, JAX, or PyTorch backends.
Link: https://keras.io/
scikit-learn – Standard ML library for Python (classification, regression, clustering).
Link: https://scikit-learn.org/
OpenCV – Open-source computer vision library.
Link: https://opencv.org/
NVIDIA NeMo – Toolkit for building speech/NLP apps via modular neural blocks.
Link: https://github.com/NVIDIA/NeMo
NNI (Neural Network Intelligence) – AutoML toolkit for hyperparameter tuning and neural architecture search (NAS).
Link: https://github.com/microsoft/nni
PyTorch Lightning – Lightweight wrapper for PyTorch that structures research and production code.
Link: https://github.com/Lightning-AI/lightning
DGL (Deep Graph Library) – Graph-neural-network library.
Link: https://www.dgl.ai/
FastAI – High-level library built on top of PyTorch aimed at fast prototyping.
Link: https://www.fast.ai/

NLP, LLMs, Transformers

Hugging Face Transformers – Library of pretrained transformer models.
Link: https://github.com/huggingface/transformers
OpenNMT – Open-source toolkit for neural machine translation (NMT).
Link: https://opennmt.net/
Rasa – Open-source conversational AI framework (chatbots).
Link: https://rasa.com/open-source/
Llama 2 – Meta's openly available large language models, released under a community license; many community fine-tunes exist.
Link: https://github.com/facebookresearch/llama
h2oGPT – Open-source LLM suite from H2O.ai for private use.
Link: https://github.com/h2oai/h2ogpt
SimplyRetrieve – Lightweight retrieval-centric generative AI tool.
Link: https://github.com/RCGAI/SimplyRetrieve
LangChain – Framework for building LLM-based agents and retrieval-augmented generation (RAG) workflows.
Link: https://github.com/hwchase17/langchain
PEFT (Parameter‑Efficient Fine‑Tuning) – Library for efficient fine-tuning of large models.
Link: https://github.com/huggingface/peft
Adapter‑Hub – Tool for adapter-based fine-tuning of transformer models.
Link: https://adapterhub.ml/
OpenAI Gym – Toolkit for developing and comparing reinforcement-learning algorithms, occasionally used for RL experiments on language tasks; no longer maintained, with Gymnasium as its successor.
Link: https://github.com/openai/gym
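
To make "parameter-efficient" concrete for the PEFT entry above, here is a small arithmetic sketch of one technique that PEFT supports, low-rank adaptation (LoRA). It is plain Python and does not use the PEFT API; the layer size and rank are illustrative assumptions, not values taken from any particular model.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Parameters updated when a dense (d_out x d_in) weight is fully fine-tuned."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters trained when the frozen weight W is adapted as W + B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)

print(f"full fine-tuning: {full:,} parameters")   # 16,777,216
print(f"LoRA (rank {rank}): {lora:,} parameters")  # 65,536
print(f"fraction trained: {lora / full:.2%}")      # 0.39%
```

Under these assumptions only about 0.4% of the layer's parameters receive gradients, which is the reason such methods fit on smaller hardware.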

Computer Vision & Generative Media

Detectron2 – Facebook’s object detection & segmentation library.
Link: https://github.com/facebookresearch/detectron2
ComfyUI – Node-based interface for image generation workflows (diffusion).
Link: https://github.com/comfyanonymous/ComfyUI
Stable Diffusion – Open-source text-to-image diffusion model.
Link: https://github.com/CompVis/stable-diffusion
ControlNet – Extends diffusion models with conditional control (pose, structure).
Link: https://github.com/lllyasviel/ControlNet
MediaPipe – Cross-platform framework for building multimodal (vision, audio) pipelines.
Link: https://github.com/google/mediapipe
OpenPose – Real-time pose estimation library.
Link: https://github.com/CMU-Perceptual-Computing-Lab/openpose
MMDetection – Open-source toolbox for object detection.
Link: https://github.com/open-mmlab/mmdetection
Albumentations – Fast image augmentation library, useful in vision pipelines.
Link: https://github.com/albumentations-team/albumentations
YOLOv5 – Real-time object detection in a lightweight package.
Link: https://github.com/ultralytics/yolov5
DeepFaceLab – Open-source deepfake creation toolkit, intended for ethical research use.
Link: https://github.com/iperov/DeepFaceLab
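
The object detectors listed above (Detectron2, MMDetection, YOLOv5) are commonly evaluated by comparing predicted boxes with ground-truth boxes using intersection over union (IoU). The following is a minimal pure-Python sketch of that metric, not code from any of those libraries; the `(x1, y1, x2, y2)` corner format is an assumption, since box conventions differ between tools.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


prediction = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
print(round(iou(prediction, ground_truth), 4))  # 0.1429 (overlap 1, union 7)
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5.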

Retrieval, RAG & Knowledge-Graph Tools

FAISS – Facebook’s library for efficient similarity search of embeddings.
Link: https://github.com/facebookresearch/faiss
Milvus – Open-source vector database optimized for embeddings.
Link: https://github.com/milvus-io/milvus
Weaviate – Cloud-native vector database with semantic search support.
Link: https://github.com/weaviate/weaviate
Redis vector search – Vector similarity search and semantic retrieval for Redis, provided by the RediSearch module.
Link: https://github.com/RediSearch/RediSearch
Qdrant – Vector similarity database designed for production workloads.
Link: https://github.com/qdrant/qdrant
Neo4j – Graph database commonly used in knowledge-graph + RAG workflows.
Link: https://neo4j.com/
AmpliGraph – Toolkit for learning knowledge-graph embeddings (TransE, RotatE).
Link: https://github.com/Accenture/AmpliGraph
Haystack – Open-source RAG pipeline for document search + LLM integration.
Link: https://github.com/deepset-ai/haystack
OpenSearch – Open-source search and analytics engine; its k-NN plugin adds vector search for AI workloads.
Link: https://github.com/opensearch-project/opensearch
Sentence‑Transformers – Library for embedding generation (sentence/paragraph) using transformers.
Link: https://github.com/UKPLab/sentence-transformers
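
The vector stores above (FAISS, Milvus, Weaviate, Qdrant) all answer the same basic question: which stored embeddings are closest to a query embedding? Below is a minimal brute-force sketch of that operation in pure Python, using cosine similarity. It does not use any of the listed libraries, which rely on optimized and often approximate indexes; the two-dimensional vectors are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query: Sequence[float],
          corpus: Sequence[Sequence[float]],
          k: int) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k most similar corpus vectors."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real ones would come from a model such as Sentence-Transformers.
corpus = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [1.0, 0.1]

hits = top_k(query, corpus, k=2)
print([index for index, _ in hits])  # [0, 2]
```

This exact scan costs time linear in the corpus size, which is the cost that dedicated vector indexes are designed to avoid.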

Audio / Music / Speech

Open Unmix – Open-source music source separation.
Link: https://github.com/sigsep/open-unmix
Spleeter – Music source-separation library by Deezer with pretrained models.
Link: https://github.com/deezer/spleeter
ESPnet – End-to-end speech and audio processing toolkit.
Link: https://github.com/espnet/espnet
WaveNet – Generative model for raw audio from DeepMind; the link is a community TensorFlow implementation.
Link: https://github.com/ibab/tensorflow-wavenet
NeMo‑ASR – Speech-recognition modules from NVIDIA’s NeMo toolkit.
Link: https://github.com/NVIDIA/NeMo
Magenta – Music and art generation research by Google.
Link: https://github.com/magenta/magenta
MusicVAE – VAE approach for music generation.
Link: https://github.com/magenta/magenta/tree/main/music_vae
SoundStream – Neural audio codec from Google Research for compressing speech and music.
Link: https://github.com/google/sound-stream
Coqui‑TTS – Open-source speech-synthesis toolkit (TTS).
Link: https://github.com/coqui-ai/tts
Whisper – OpenAI’s open-source speech-to-text model.
Link: https://github.com/openai/whisper

Reinforcement Learning & Simulation

Stable Baselines3 – RL library with implementations of PPO, SAC, etc.
Link: https://github.com/DLR-RM/stable-baselines3
RLlib – Scalable RL library, part of the Ray ecosystem.
Link: https://github.com/ray-project/ray
Gymnasium – Successor to OpenAI Gym for RL environments.
Link: https://github.com/Farama-Foundation/Gymnasium
mujoco-py – Python bindings for the MuJoCo physics simulator, widely used in RL; deprecated in favor of the official mujoco bindings.
Link: https://github.com/openai/mujoco-py
Isaac Gym – GPU-accelerated RL physics simulation by NVIDIA.
Link: https://github.com/NVIDIA-Omniverse/IsaacGym
PettingZoo – Multi-agent RL environments library.
Link: https://github.com/Farama-Foundation/PettingZoo
Acme – DeepMind’s RL research library.
Link: https://github.com/deepmind/acme
Stable‑Diffusion‑RL – Combines RL with generative diffusion (research).
Link: https://github.com/CompVis/stable-diffusion-rl
OpenAI Spinning Up – Educational RL resource with reference implementations.
Link: https://github.com/openai/spinningup
Dopamine – RL research framework from Google.
Link: https://github.com/google/dopamine
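
Most of the RL libraries above interact with environments through the interface popularized by Gym and continued in Gymnasium: `reset()` returns an observation and an info dict, and `step(action)` returns observation, reward, `terminated`, `truncated`, and info. The sketch below mimics that interface in pure Python without importing Gymnasium; `LineWorld` and `run_episode` are hypothetical names invented for this illustration.

```python
from typing import Callable, Dict, Tuple


class LineWorld:
    """Toy environment: walk along positions 0..size-1; reaching the end pays 1.0."""

    def __init__(self, size: int = 5, max_steps: int = 20) -> None:
        self.size = size
        self.max_steps = max_steps
        self.pos = 0
        self.steps = 0

    def reset(self, seed=None) -> Tuple[int, Dict]:
        # seed is accepted for interface parity; this toy environment is deterministic.
        self.pos = 0
        self.steps = 0
        return self.pos, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict]:
        self.steps += 1
        move = 1 if action == 1 else -1          # 1 = right, anything else = left
        self.pos = min(self.size - 1, max(0, self.pos + move))
        terminated = self.pos == self.size - 1   # goal reached
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self.pos, reward, terminated, truncated, {}


def run_episode(env: LineWorld, policy: Callable[[int], int]) -> Tuple[float, int]:
    """Run one episode and return (total reward, number of steps)."""
    obs, _info = env.reset()
    total, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        steps += 1
        if terminated or truncated:
            return total, steps


print(run_episode(LineWorld(), lambda obs: 1))  # (1.0, 4)
```

Separating `terminated` (the task ended) from `truncated` (a time limit cut the episode short) is the main change Gymnasium made relative to the older Gym `done` flag.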

Deployment, Monitoring & MLOps

MLflow – Platform for experiment tracking, model registry.
Link: https://github.com/mlflow/mlflow
Weights & Biases – Experiment tracking and model management; the client library is open source, and the hosted service has a free tier.
Link: https://github.com/wandb/client
BentoML – Model serving library for ML/AI.
Link: https://github.com/bentoml/BentoML
TorchServe – Serving PyTorch models in production.
Link: https://github.com/pytorch/serve
Seldon Core – Kubernetes-based open source ML deployment toolkit.
Link: https://github.com/SeldonIO/seldon-core
Kubeflow – ML orchestration on Kubernetes.
Link: https://github.com/kubeflow/kubeflow
Flyte – Workflow automation for ML pipelines.
Link: https://github.com/flyteorg/flyte
DVC (Data Version Control) – Versioning for data & models.
Link: https://github.com/iterative/dvc
OpenTelemetry – Observability framework used for ML monitoring.
Link: https://github.com/open-telemetry/opentelemetry-specification
Prometheus – Monitoring & alerting toolkit (can integrate ML metrics).
Link: https://github.com/prometheus/prometheus

Miscellaneous / Emerging / Domain-Specific

LightAgent – Open-source agentic AI framework for multi-agent workflows.
Link: https://github.com/wxai-space/LightAgent
AutoGPT – Open-source autonomous agent framework built on LLMs.
Link: https://github.com/Significant-Gravitas/Auto-GPT
ToolJet – Open-source low-code/no-code platform with AI integrations.
Link: https://www.tooljet.ai/
CVAT – Web-based annotation tool for image/video labeling.
Link: https://github.com/cvat-ai/cvat
Awesome-Open-Source-AI-Tools – Curated list of 700+ open-source AI tools.
Link: https://github.com/mahseema/awesome-ai-tools
OpenHands – Open-source library for pose-based sign language recognition from AI4Bharat.
Link: https://github.com/AI4Bharat/OpenHands
CNTK (Microsoft Cognitive Toolkit) – Deep-learning toolkit by Microsoft (no longer actively developed, but the source remains available).
Link: https://github.com/microsoft/CNTK
ELMo – Pretrained contextual word embeddings from the Allen Institute for AI, available through AllenNLP (older NLP).
Link: https://github.com/allenai/allennlp
ONNX Runtime – Open-source, cross-platform inference engine for models exported to ONNX from frameworks such as PyTorch and TensorFlow.
Link: https://github.com/microsoft/onnxruntime