AI Model Compendium
Complete AI Model Reference
Concise descriptions, difficulty levels, typical uses, and example projects for major AI, ML, and deep learning models (a comprehensive list as of 2025).
Classical Machine Learning Models
| Model | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| Linear Regression | Beginner | Predict continuous targets via linear combination of features; teaches OLS and gradients. | House price prediction; sales/time-series forecasting; energy consumption modeling; baseline regression experiments; feature selection studies. |
| Logistic Regression | Beginner | Binary classification using sigmoid; outputs probabilities and interpretable coefficients. | Spam detection; medical screening; churn prediction; credit default classification; simple NLP classification with bag‑of‑words. |
| Decision Tree | Beginner | Hierarchical splits on features producing human‑readable rules; easy to visualize. | Credit scoring rules; diagnostic flowcharts; interpretable classification demos; feature importance visualizer; teaching decision logic. |
| Random Forest | Intermediate | Ensemble of randomized trees; reduces variance and overfitting via averaging. | Tabular baseline for industry problems; feature importance reports; anomaly detection; ecology / bioinformatics classification; model stacking component (see the sketch below this table). |
| Gradient Boosting (XGBoost / LightGBM / CatBoost) | Intermediate | Sequentially built trees that focus on correcting prior errors; frequently state‑of‑the‑art on tabular tasks. | Kaggle‑style tabular pipelines; credit risk scoring; demand forecasting; click‑through rate prediction; categorical-heavy datasets. |
| Support Vector Machine (SVM) | Intermediate | Finds maximum margin hyperplane; kernel trick enables non-linear boundaries. | Text categorization with TF‑IDF; face detection; small-sample classification; kernel comparison experiments. |
| K-Nearest Neighbors (KNN) | Beginner | Instance-based classifier that votes among the nearest labeled neighbors; simple but effective on small, low-dimensional datasets. | Recommender prototypes; handwriting recognition demos; local anomaly detection; content-based filtering. |
| K‑Means | Beginner | Partitions data into k clusters by minimizing within‑cluster variance. | Customer segmentation; color quantization for images; document clustering; music‑phrase grouping; prototyping for active learning. |
| DBSCAN | Intermediate | Density‑based clustering that finds arbitrary shapes and marks noise/outliers. | Geospatial clustering; anomaly detection on sensor streams; clustering noisy audio features; density-based segmentation. |
| PCA (Principal Component Analysis) | Intermediate | Linear dimensionality reduction projecting data onto principal variance directions. | Feature compression; visualization; denoising; pre-processing for downstream models; exploratory analysis of embeddings. |
| t‑SNE / UMAP | Intermediate | Nonlinear projection methods for visualizing high‑dimensional data in 2D/3D. | Visualize raag/phrase embeddings; cluster structure exploration; embedding space debugging; model representation comparisons. |
| Gaussian Mixture Model (GMM) | Intermediate | Probabilistic clustering using mixtures of Gaussians; gives soft cluster assignments. | Speaker diarization prototypes; density estimation; unsupervised phoneme modeling; music phrase probabilistic modeling. |
| Ensembles (Bagging, Boosting, Stacking) | Advanced | Combine multiple models to improve accuracy/robustness; stacking trains a meta‑learner on base predictions. | Production ML pipelines; AutoML backbones; competition-winning ensembles; robust risk models; hybrid model deployment. |
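
As a quick illustration of the Random Forest row above, here is a minimal scikit-learn sketch. It is a toy under stated assumptions: the synthetic dataset, split, and hyperparameters are placeholders, not a tuned recipe.

```python
# Minimal tabular-baseline sketch with scikit-learn.
# The synthetic data and hyperparameters below are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy tabular dataset: 1,000 rows, 20 numeric features.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Ensemble of randomized trees; averaging their votes reduces variance.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
# Per-feature importances support the "feature importance reports" use case.
print("three largest importances:", sorted(model.feature_importances_)[-3:])
```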
Neural Network Models
| Model | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| Perceptron / MLP | Beginner | Fully connected networks; MLPs learn non‑linear mappings using dense layers and activations. | Tabular prediction with neural nets; baseline classifier/regressor; basic autoencoder; educational walkthrough of backpropagation (see the training‑loop sketch below this table); deploy a simple MLP API. |
| Convolutional Neural Network (CNN) | Intermediate | Convolutions detect local spatial patterns; pooling and hierarchical features make them ideal for images and spectrograms. | Image classification (ImageNet, custom), audio spectrogram classification (raga/instrument), OCR for notation, transfer learning for small datasets, feature visualization. |
| ResNet / VGG / EfficientNet | Advanced | Variants of CNNs addressing depth, efficiency and training stability (residual connections, scaling). | Medical imaging pipelines, object detection backbones, model compression experiments, transfer learning for niche image tasks, performance benchmarking. |
| RNN / LSTM / GRU | Intermediate | Recurrent networks capture temporal dependencies; LSTM/GRU improve long‑term memory and training stability. | Melody generation, next‑note prediction, sequence labeling of musical events, BPM/time‑series modeling, compare RNN vs Transformer for sequences. |
| Seq2Seq (Encoder‑Decoder) | Advanced | Maps input sequences to output sequences; often augmented with attention for alignment (used in translation, summarization). | Notation‑to‑audio pipelines, music transcription, melody harmonization, automated lyric translation, guided sequence transformation. |
| Autoencoder (AE) | Intermediate | Compresses inputs to a latent representation and reconstructs them—useful for denoising and feature learning. | Denoise audio, compress notation, anomaly detection in recordings, pretrain encoders for downstream tasks, latent space visualization. |
| Variational Autoencoder (VAE) | Advanced | Probabilistic autoencoder enabling sampling from latent distributions for generative tasks. | Generate melody variations, interpolate between motifs, conditional generation by raga tag, data augmentation, latent‑space exploration tools. |
| GAN (Generative Adversarial Network) | Advanced | Adversarial training of a generator and a discriminator to produce realistic samples. | Create new audio textures, style transfer between genres, album-art generation, augment small datasets, timbre conversion experiments. |
| Transformer (Self‑Attention) | Advanced | Self‑attention models that capture pairwise interactions across sequences in parallel; foundation of modern LLMs. | Implement toy transformer, pretrain on music tokens, melody autocomplete, attention analysis for musical motifs, fine‑tune for chord prediction. |
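
To ground the MLP row above, here is a minimal PyTorch training-loop sketch; the layer widths, toy random batch, and step count are assumptions made purely for illustration.

```python
# Minimal MLP + backpropagation sketch in PyTorch (toy data, illustrative sizes).
import torch
import torch.nn as nn

# Dense network: two hidden layers with ReLU activations.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),  # logits for a 2-class problem
)

X = torch.randn(256, 20)         # toy batch of tabular features
y = torch.randint(0, 2, (256,))  # toy integer labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)  # forward pass + loss
    loss.backward()              # backpropagation
    optimizer.step()             # parameter update

print("final loss:", loss.item())
```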
Transformers & Large Language Models (LLMs)
| Model | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| BERT / RoBERTa / ALBERT | Advanced | Encoder‑only transformers pretrained with masked language modeling for contextual understanding. | NER for music corpora, semantic search over bandishes, classification of lyrics, embedding extraction for similarity (see the sketch below this table), fine‑tune for metadata tagging. |
| GPT family (GPT‑1 → GPT‑5) & open decoder LLMs (LLaMA, Mistral, Falcon) | Expert | Decoder‑only autoregressive transformers trained to predict next tokens; excel at generation and instruction following. | Interactive composition assistant, practice-plan generator, notation-to-text converter, domain‑fine‑tuned tutor for Hindustani music, creative lyric/melody co‑authoring. |
| T5 / BART | Advanced | Encoder‑decoder text‑to‑text frameworks useful for any sequence transformation task. | Summarization of lecture notes, paraphrasing bandish descriptions, automated notation normalization, question generation, lyric rewriting. |
| Vision Transformer (ViT) / CLIP / BLIP | Advanced | Applies self‑attention to image patch tokens; CLIP aligns images and text for cross‑modal tasks. | Sheet‑image captioning, cross‑modal search (audio ↔ sheet), visual notation classification, image‑based dataset indexing, caption generation for concerts. |
| Audio & Music Models (Whisper, Wav2Vec2, MusicLM, Jukebox) | Advanced | Transformers tailored for audio tasks: ASR, speech embeddings, and music generation/synthesis. | Transcribe bandish recordings, extract melody embeddings, generate accompaniment, timbre conversion, build practice feedback systems. |
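
The "embedding extraction for similarity" use in the BERT row can be sketched with Hugging Face Transformers as below. This assumes transformers and torch are installed and downloads bert-base-uncased on first run; mean pooling is one common choice, not the only one.

```python
# Sketch: extract sentence embeddings from an encoder-only transformer.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

texts = ["a slow evening raga", "an energetic morning composition"]
batch = tokenizer(texts, padding=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

# Mean-pool the token embeddings into one vector per text.
emb = out.last_hidden_state.mean(dim=1)
sim = torch.nn.functional.cosine_similarity(emb[0], emb[1], dim=0)
print("cosine similarity:", sim.item())
```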
Graph & Relational Models
| Model | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| Graph Neural Network (GNN) / GCN | Advanced | Message‑passing networks that learn from node/edge structure and propagate features across graphs. | Raag knowledge graph embeddings (Neo4j + GNN), phrase link prediction, recommendation by graph proximity, molecular property prediction, artist collaboration networks; see the message‑passing sketch below this table. |
| GraphSAGE / GAT | Advanced | Scalable neighborhood sampling (GraphSAGE) and attention‑based graph message weighting (GAT). | Inductive node classification, influencer detection, weighted relation modeling, music community detection, graph-based retrieval. |
| Knowledge Graph Embeddings (TransE, RotatE, ComplEx) | Advanced | Embed entities & relations into vector spaces preserving relational structure for reasoning and retrieval. | Semantic search over bandishes, QA over structured music facts, link completion for missing relations, hybrid RAG pipelines, ontology alignment. |
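
A toy sketch of the message passing described in the GNN/GCN row, using PyTorch Geometric; the 4-node graph, feature sizes, and two-layer design are invented for illustration.

```python
# Two-layer GCN forward pass on a tiny invented graph (PyTorch Geometric).
import torch
from torch_geometric.nn import GCNConv

# 4 nodes with 8 features each; edge_index stores directed edges as (src, dst) columns.
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]], dtype=torch.long)

conv1 = GCNConv(8, 16)  # message passing: aggregate transformed neighbor features
conv2 = GCNConv(16, 2)  # project to 2 per-node output classes

h = conv1(x, edge_index).relu()
logits = conv2(h, edge_index)
print(logits.shape)  # torch.Size([4, 2]), one score vector per node
```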
Reinforcement Learning
| Model | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| Q‑Learning | Intermediate | Value‑based RL that learns state–action values for discrete action problems. | Grid‑world agents, discrete music game (hit correct beat), pathfinding, policy visualization, basic RL education (see the FrozenLake sketch below this table). |
| Deep Q‑Network (DQN) | Advanced | Neural network approximates Q function for high‑dimensional observations (images, spectrograms). | Atari benchmark agents, rhythm game agent, simulated instrument control, RL curriculum experiments, replay buffer studies. |
| Policy Gradient / Actor‑Critic / PPO / SAC | Advanced | Directly optimize a stochastic policy, often paired with a value‑function critic for stability; PPO is widely used in practice. | Continuous instrument control, expressive performance optimization, simulated conductor, robotics finger control, reward shaping experiments. |
| AlphaZero / MuZero | Expert | Combine deep RL with MCTS and learned dynamics for planning and strategy learning. | Game-playing agents (chess/go), planning in music composition search, high‑level strategy simulators, research on sample‑efficiency and planning. |
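
To make the Q-Learning row concrete, here is a tabular sketch on Gymnasium's FrozenLake; the learning rate, discount, exploration rate, and episode count are illustrative defaults rather than tuned values.

```python
# Tabular Q-learning on FrozenLake (requires the gymnasium package).
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1", is_slippery=False)
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, eps = 0.1, 0.99, 0.1  # illustrative hyperparameters

for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # Epsilon-greedy: explore with probability eps, else act greedily.
        action = env.action_space.sample() if np.random.rand() < eps else int(Q[state].argmax())
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Q-learning update: move Q(s, a) toward reward + gamma * max_a' Q(s', a').
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print("greedy policy per state:\n", Q.argmax(axis=1).reshape(4, 4))
```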
Generative & Diffusion Models
| Model | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| Variational Autoencoder (VAE) | Advanced | Latent probabilistic model enabling sampling and interpolation between encoded inputs. | Generate melody variants, interpolate between bandishes, conditional VAE by raga, data augmentation, latent visualization tools (see the reparameterization sketch below this table). |
| GANs (DCGAN / StyleGAN / CycleGAN) | Advanced | Adversarial generator/discriminator pairs for realistic sample synthesis or domain translation. | Style transfer for musical art, generate album art, audio texture synthesis, convert folk ↔ classical styles, GAN augmentation pipelines. |
| Diffusion Models (DDPM / DDIM / Latent Diffusion / Stable Diffusion) | Expert | Iterative denoising from noise to data; excels at high-fidelity image, audio and video generation. | Text‑to‑image concert visuals, spectrogram diffusion for audio gen, conditional diffusion for melody→audio, research on controllable synthesis, multimodal diffusion experiments. |
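
To illustrate the VAE row, here is a minimal PyTorch encoder showing the reparameterization trick and KL term that make sampling possible; the sizes are invented, and a full model would add a decoder and a reconstruction loss.

```python
# Minimal VAE encoder sketch: reparameterization trick + KL regularizer.
import torch
import torch.nn as nn

class TinyVAEEncoder(nn.Module):
    def __init__(self, in_dim=128, latent_dim=16):  # invented sizes
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(64, latent_dim)  # log-variance of q(z|x)

    def forward(self, x):
        h = self.body(x)
        mu, logvar = self.mu(h), self.logvar(h)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)  # reparameterize: z = mu + sigma * eps
        # KL(q(z|x) || N(0, I)): the regularizer that keeps the latent space sampleable.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
        return z, kl

enc = TinyVAEEncoder()
z, kl = enc(torch.randn(8, 128))
print(z.shape, kl.item())
```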
Hybrid, Symbolic & Agentic Systems
| Model / Pattern | Level | Description | Common Uses / Example Projects |
|---|---|---|---|
| Retrieval‑Augmented Generation (RAG) | Advanced | Combine LLMs with external retrievers (vector DBs, knowledge graphs) to ground responses and reduce hallucination. | Bandish QA chatbot with citations, tutor system linked to raag KG, enterprise document assistants, hybrid RAG (vectors + Neo4j), provenance tracking demos (see the retrieval sketch below this table). |
| Mixture of Experts (MoE) | Expert | Large-scale models with sparse routing to specialized experts to scale capacity efficiently. | Experiment with expert routing for musical domains, MoE toy for prompt routing, study efficiency vs dense baselines, adapt experts to ragas, research in MoE stability. |
| Neuro‑Symbolic & Knowledge Graph Integration | Expert | Combines symbolic reasoning (graphs, logic) with neural networks for explainable, grounded AI. | Hybrid QA systems, rule + NN compositional models, knowledge‑grounded tutoring, graph reasoning pipelines for music ontology, explainable recommendation systems. |
| Agentic Multi‑Agent Systems | Expert | Orchestration of multiple agents (LLMs) to plan, act, and collaborate for complex tasks and workflows. | Practice coach agents, multi‑agent composition systems, dataset curation agents, long‑term tutoring with memory, autonomous research assistants. |
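
The retrieval half of the RAG row can be sketched with FAISS as below. The toy documents and random vectors stand in for a real embedding model; in practice you would embed documents and queries with the same encoder and prepend the retrieved passages to the LLM prompt.

```python
# Skeleton of the retrieval step in a RAG pipeline (requires faiss-cpu and numpy).
import faiss
import numpy as np

docs = ["Raag Yaman uses a sharp Ma.",
        "Raag Bhairav is a morning raga.",
        "A bandish is a fixed composition."]
dim = 32
rng = np.random.default_rng(0)
doc_vecs = rng.standard_normal((len(docs), dim)).astype("float32")  # placeholder embeddings

index = faiss.IndexFlatL2(dim)  # exact nearest-neighbor search over L2 distance
index.add(doc_vecs)

query_vec = doc_vecs[1:2] + 0.01  # pretend query embedding, near document 1
distances, ids = index.search(query_vec, 2)
context = [docs[i] for i in ids[0]]
print(context)  # these passages would be stuffed into the LLM prompt
```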
Tools, Libraries & Platforms
Common tools you’ll use across models:
| Area | Tools / Libraries |
|---|---|
| General ML | scikit‑learn, pandas, numpy, scipy |
| Deep Learning | PyTorch, TensorFlow, Keras, JAX |
| Transformers & LLMs | Hugging Face Transformers, 🤗 Datasets, PEFT (LoRA and other adapters), Hugging Face Hub |
| GNN | PyTorch Geometric, DGL |
| Retrieval & Vectors | FAISS, Milvus, Annoy, Pinecone, Weaviate, LlamaIndex, LangChain |
| RL | Gymnasium, Stable Baselines3, RLlib |
| Generative | diffusers, Magenta, torchaudio, plus reference implementations of Jukebox and WaveNet |
| Deployment | Docker, FastAPI, TorchServe, BentoML, Kubernetes (see the FastAPI sketch below this table) |
| MLOps | Weights & Biases, MLflow, TensorBoard |
| Graph DB | Neo4j, TigerGraph |
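
As a closing example for the Deployment row, here is a minimal FastAPI serving sketch; the /predict endpoint and its stand-in scoring logic are hypothetical placeholders for a real loaded model.

```python
# Minimal model-serving sketch with FastAPI (requires fastapi and uvicorn).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Features(BaseModel):
    values: list[float]  # request body: one feature vector

@app.post("/predict")
def predict(features: Features):
    # Stand-in for model.predict(...); swap in a loaded scikit-learn or PyTorch model.
    score = sum(features.values) / max(len(features.values), 1)
    return {"score": score}

# Run with: uvicorn app:app --reload   (assuming this file is saved as app.py)
```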