AI Model Compendium

COMPLETE AI MODEL REFERENCE — Full

Concise descriptions, difficulty levels, typical uses, and example projects for major AI, ML, and deep learning models (a comprehensive list as of 2025).

Classical Machine Learning Models

Model | Level | Description | Common Uses / Example Projects
Linear Regression | Beginner | Predicts a continuous target as a linear combination of features; a good way to learn ordinary least squares (OLS) and gradient descent. | House price prediction; sales/time-series forecasting; energy consumption modeling; baseline regression experiments; feature selection studies.
Logistic Regression | Beginner | Binary classification using the sigmoid function; outputs probabilities and interpretable coefficients. | Spam detection; medical screening; churn prediction; credit default classification; simple NLP classification with bag-of-words.
Decision Tree | Beginner | Hierarchical splits on features that produce human-readable rules; easy to visualize. | Credit scoring rules; diagnostic flowcharts; interpretable classification demos; feature importance visualizer; teaching decision logic.
Random Forest | Intermediate | Ensemble of randomized trees; averaging reduces variance and overfitting. | Tabular baseline for industry problems; feature importance reports; anomaly detection; ecology / bioinformatics classification; model stacking component.
Gradient Boosting (XGBoost / LightGBM / CatBoost) | Intermediate | Trees built sequentially, each one correcting the errors of those before it; often state of the art on tabular tasks. | Kaggle-style tabular pipelines; credit risk scoring; demand forecasting; click-through rate prediction; categorical-heavy datasets.
Support Vector Machine (SVM) | Intermediate | Finds the maximum-margin hyperplane; the kernel trick enables non-linear boundaries. | Text categorization with TF-IDF; face detection; small-sample classification; kernel comparison experiments.
K-Nearest Neighbors (KNN) | Beginner | Instance-based classifier that labels a point from its nearest labeled neighbors; simple and effective on small datasets. | Recommender prototypes; handwriting recognition demos; local anomaly detection; content-based filtering.
K-Means | Beginner | Partitions data into k clusters by minimizing within-cluster variance. | Customer segmentation; color quantization for images; document clustering; music-phrase grouping; prototyping for active learning.
DBSCAN | Intermediate | Density-based clustering that finds clusters of arbitrary shape and marks noise/outliers. | Geospatial clustering; anomaly detection on sensor streams; clustering noisy audio features; density-based segmentation.
PCA (Principal Component Analysis) | Intermediate | Linear dimensionality reduction that projects data onto the directions of greatest variance. | Feature compression; visualization; denoising; pre-processing for downstream models; exploratory analysis of embeddings.
t-SNE / UMAP | Intermediate | Nonlinear projection methods for visualizing high-dimensional data in 2D/3D. | Visualizing raga/phrase embeddings; cluster structure exploration; embedding space debugging; model representation comparisons.
Gaussian Mixture Model (GMM) | Intermediate | Probabilistic clustering using mixtures of Gaussians; gives soft cluster assignments. | Speaker diarization prototypes; density estimation; unsupervised phoneme modeling; probabilistic modeling of music phrases.
Ensembles (Bagging, Boosting, Stacking) | Advanced | Combine multiple models to improve accuracy and robustness; stacking trains a meta-learner on the base models' predictions. | Production ML pipelines; AutoML backbones; competition-winning ensembles; robust risk models; hybrid model deployment.
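
As a small illustration of the Linear Regression entry above, the following sketch fits a one-feature line two ways: with the closed-form OLS solution and with plain gradient descent on the mean squared error. The function names, toy data, learning rate, and step count are assumptions chosen for this example, not part of any library.

```python
def fit_ols(xs, ys):
    """Closed-form least squares for one feature: slope = cov(x, y) / var(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_gradient_descent(xs, ys, lr=0.01, steps=5000):
    """Minimize mean squared error with batch gradient descent (assumed hyperparameters)."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_slope = 0.0
        grad_intercept = 0.0
        for x, y in zip(xs, ys):
            error = slope * x + intercept - y
            grad_slope += 2.0 * error * x / n
            grad_intercept += 2.0 * error / n
        slope -= lr * grad_slope
        intercept -= lr * grad_intercept
    return slope, intercept


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
print(fit_ols(xs, ys))
print(fit_gradient_descent(xs, ys))
```

Both routes should agree closely on this data; in practice a library such as scikit-learn would normally be used instead of hand-written loops.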

Neural Network Models

Model | Level | Description | Common Uses / Example Projects
Perceptron / MLP | Beginner | Fully connected networks; MLPs learn non-linear mappings using dense layers and activation functions. | Tabular prediction with neural nets; baseline classifier/regressor; basic autoencoder; educational walkthrough of backpropagation; deploying a simple MLP API.
Convolutional Neural Network (CNN) | Intermediate | Convolutions detect local spatial patterns; pooling and hierarchical features make CNNs well suited to images and spectrograms. | Image classification (ImageNet, custom datasets); audio spectrogram classification (raga/instrument); OCR for notation; transfer learning for small datasets; feature visualization.
ResNet / VGG / EfficientNet | Advanced | CNN architectures that address depth, efficiency, and training stability (residual connections in ResNet, compound scaling in EfficientNet). | Medical imaging pipelines; object detection backbones; model compression experiments; transfer learning for niche image tasks; performance benchmarking.
RNN / LSTM / GRU | Intermediate | Recurrent networks capture temporal dependencies; LSTM/GRU gating improves long-term memory and training stability. | Melody generation; next-note prediction; sequence labeling of musical events; BPM/time-series modeling; comparing RNNs with Transformers on sequences.
Seq2Seq (Encoder-Decoder) | Advanced | Maps input sequences to output sequences; often augmented with attention for alignment (used in translation and summarization). | Notation-to-audio pipelines; music transcription; melody harmonization; automated lyric translation; guided sequence transformation.
Autoencoder (AE) | Intermediate | Compresses inputs to a latent representation and reconstructs them; useful for denoising and feature learning. | Audio denoising; notation compression; anomaly detection in recordings; pretraining encoders for downstream tasks; latent space visualization.
Variational Autoencoder (VAE) | Advanced | Probabilistic autoencoder that enables sampling from a latent distribution for generative tasks. | Generating melody variations; interpolating between motifs; conditional generation by raga tag; data augmentation; latent-space exploration tools.
GAN (Generative Adversarial Network) | Advanced | Adversarial training of a generator against a discriminator to produce realistic samples. | Creating new audio textures; style transfer between genres; album-art generation; augmenting small datasets; timbre conversion experiments.
Transformer (Self-Attention) | Advanced | Self-attention models that capture pairwise interactions across a sequence in parallel; the foundation of modern LLMs. | Implementing a toy transformer; pretraining on music tokens; melody autocomplete; attention analysis for musical motifs; fine-tuning for chord prediction.
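
To make the Transformer entry above more concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It assumes the query, key, and value vectors are passed in directly; a real transformer would first compute them with learned projection matrices and use multiple heads, both of which are left out. The function names and toy vectors are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Return (outputs, weights) for single-head scaled dot-product attention."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(row, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Three toy token vectors used as queries, keys, and values (assumed data).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of the printed weights sums to 1 and shows how strongly one position attends to every other position, which is the "pairwise interactions" idea in the table.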

Transformers & Large Language Models (LLMs)

Model | Level | Description | Common Uses / Example Projects
BERT / RoBERTa / ALBERT | Advanced | Encoder-only transformers pretrained with masked language modeling for contextual understanding. | NER for music corpora; semantic search over bandishes; classification of lyrics; embedding extraction for similarity; fine-tuning for metadata tagging.
GPT-style decoder-only LLMs (GPT-1 → GPT-5, LLaMA, Mistral, Falcon) | Expert | Decoder-only autoregressive transformers trained to predict the next token; excel at generation and instruction following. | Interactive composition assistant; practice-plan generator; notation-to-text converter; domain-fine-tuned tutor for Hindustani music; creative lyric/melody co-authoring.
T5 / BART | Advanced | Encoder-decoder text-to-text models suited to sequence transformation tasks. | Summarization of lecture notes; paraphrasing bandish descriptions; automated notation normalization; question generation; lyric rewriting.
Vision Transformer (ViT) / CLIP / BLIP | Advanced | ViT applies a transformer to image patches treated as tokens; CLIP aligns images and text for cross-modal tasks. | Sheet-image captioning; cross-modal search (audio ↔ sheet); visual notation classification; image-based dataset indexing; caption generation for concerts.
Audio & Music Models (Whisper, Wav2Vec2, MusicLM, Jukebox) | Advanced | Transformer-based models for audio tasks: ASR, speech embeddings, and music generation/synthesis. | Transcribing bandish recordings; extracting melody embeddings; generating accompaniment; timbre conversion; building practice feedback systems.
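
The difference between the masked language modeling objective (BERT row) and next-token prediction (GPT row) is easiest to see in how training examples are built from the same token sequence. The sketch below is a simplified illustration: the helper names and the toy tokens are assumptions, and the mask positions are passed in explicitly, whereas real pretraining selects them at random.

```python
MASK = "[MASK]"


def causal_lm_examples(tokens):
    """Next-token prediction: each prefix is paired with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_lm_example(tokens, mask_positions):
    """Masked LM: hide chosen positions; the model sees both left and right context."""
    masked = list(tokens)
    labels = {}
    for pos in mask_positions:
        labels[pos] = tokens[pos]  # the original token is the prediction target
        masked[pos] = MASK
    return masked, labels


# Toy sequence of note names (hypothetical tokenization).
tokens = ["sa", "re", "ga", "ma"]

for prefix, target in causal_lm_examples(tokens):
    print(prefix, "->", target)

print(masked_lm_example(tokens, [1, 3]))
```

The causal examples only ever look leftward, which is why decoder-only models generate well; the masked example keeps context on both sides, which suits the understanding tasks listed for encoder-only models.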

Graph & Relational Models

Model | Level | Description | Common Uses / Example Projects
Graph Neural Network (GNN) / GCN | Advanced | Message-passing networks that learn from node/edge structure and propagate features across graphs. | Raga knowledge graph embeddings (Neo4j + GNN); phrase link prediction; recommendation by graph proximity; molecular property prediction; artist collaboration networks.
GraphSAGE / GAT | Advanced | Scalable neighborhood sampling (GraphSAGE) and attention-based weighting of graph messages (GAT). | Inductive node classification; influencer detection; weighted relation modeling; music community detection; graph-based retrieval.
Knowledge Graph Embeddings (TransE, RotatE, ComplEx) | Advanced | Embed entities and relations into vector spaces that preserve relational structure for reasoning and retrieval. | Semantic search over bandishes; QA over structured music facts; link completion for missing relations; hybrid RAG pipelines; ontology alignment.
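
The message-passing idea in the GNN row above can be sketched as a single round of neighbor averaging over an undirected graph. This is a deliberately reduced version: it assumes mean aggregation with a self-loop and omits the learned weight matrix, normalization scheme, and nonlinearity that an actual GCN layer applies. Node names and features are made up for the example.

```python
def message_passing_step(features, edges):
    """One round of mean aggregation: each node averages itself and its neighbors."""
    # Start every neighborhood with the node itself (a self-loop).
    neighbors = {node: [node] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, nbrs in neighbors.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)
        ]
    return updated


# A three-node path graph a - b - c with 2-dimensional features (assumed data).
features = {"a": [1.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 1.0]}
edges = [("a", "b"), ("b", "c")]

print(message_passing_step(features, edges))
```

After one step, node b carries information from both a and c; stacking more steps lets features travel further across the graph.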

Reinforcement Learning

Model | Level | Description | Common Uses / Example Projects
Q-Learning | Intermediate | Value-based RL that learns state-action values for problems with discrete actions. | Grid-world agents; discrete music game (hit the correct beat); pathfinding; policy visualization; basic RL education.
Deep Q-Network (DQN) | Advanced | A neural network approximates the Q function for high-dimensional observations (images, spectrograms). | Atari benchmark agents; rhythm game agent; simulated instrument control; RL curriculum experiments; replay buffer studies.
Policy Gradient / Actor-Critic / PPO / SAC | Advanced | Directly optimize a (typically stochastic) policy, often combined with a value estimator (critic) for stability; PPO is widely used in practice. | Continuous instrument control; expressive performance optimization; simulated conductor; robotics finger control; reward shaping experiments.
AlphaZero / MuZero | Expert | Combine deep RL with Monte Carlo tree search (MCTS) for planning and strategy learning; MuZero additionally learns a dynamics model. | Game-playing agents (chess/Go); planning in music composition search; high-level strategy simulators; research on sample efficiency and planning.
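
For the Q-Learning row above, a tabular version fits in a few lines. The sketch below assumes a hypothetical five-state corridor where the agent starts on the left and receives a reward of 1 for reaching the rightmost state. Because Q-learning is off-policy, the behavior policy here is simply uniformly random; the environment, hyperparameters, and function names are all illustrative choices.

```python
import random

N_STATES = 5            # states 0..4 in a corridor (assumed toy environment)
GOAL = N_STATES - 1     # reaching the rightmost state ends the episode
ACTIONS = (-1, +1)      # index 0 = move left, index 1 = move right


def step(state, action):
    """Move within the corridor; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def train_q_table(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))  # random exploration (off-policy)
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned table."""
    return ["left" if row[0] > row[1] else "right" for row in q[:GOAL]]


q_table = train_q_table()
print(greedy_policy(q_table))
```

A DQN replaces the table `q` with a neural network and adds a replay buffer, but the update target has the same form.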

Generative & Diffusion Models

Model | Level | Description | Common Uses / Example Projects
Variational Autoencoder (VAE) | Advanced | Latent probabilistic model that enables sampling and interpolation between encoded inputs. | Generating melody variants; interpolating between bandishes; conditional VAE by raga; data augmentation; latent visualization tools.
GANs (DCGAN / StyleGAN / CycleGAN) | Advanced | Adversarial generator/discriminator pairs for realistic sample synthesis or domain translation. | Style transfer for musical art; album art generation; audio texture synthesis; converting folk ↔ classical styles; GAN augmentation pipelines.
Diffusion Models (DDPM / DDIM / Latent Diffusion / Stable Diffusion) | Expert | Generate data by iteratively denoising from pure noise; excel at high-fidelity image, audio, and video generation. | Text-to-image concert visuals; spectrogram diffusion for audio generation; conditional diffusion for melody→audio; research on controllable synthesis; multimodal diffusion experiments.
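
The "iterative denoising" in the Diffusion Models row is trained against a fixed forward process that gradually adds Gaussian noise. In DDPM that forward step has a closed form, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, which the sketch below implements. The linear schedule endpoints (1e-4 to 0.02 over 1000 steps) are commonly used defaults and are an assumption here, as are the function names and the toy input; the learned denoising network is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances that increase linearly from beta_start to beta_end."""
    delta = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * delta for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0 using the closed-form forward process."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [1.0, -1.0, 0.5]  # stand-in for a tiny "clean" sample
print(add_noise(x0, 0, alpha_bars, rng))    # almost unchanged
print(add_noise(x0, 999, alpha_bars, rng))  # close to pure noise
```

A diffusion model is trained to predict the added noise at a random step t, and generation runs the process in reverse starting from noise.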

Hybrid, Symbolic & Agentic Systems

Model / Pattern | Level | Description | Common Uses / Example Projects
Retrieval-Augmented Generation (RAG) | Advanced | Combines LLMs with external retrievers (vector DBs, knowledge graphs) to ground responses and reduce hallucination. | Bandish QA chatbot with citations; tutor system linked to a raga knowledge graph; enterprise document assistants; hybrid RAG (vectors + Neo4j); provenance tracking demos.
Mixture of Experts (MoE) | Expert | Large-scale models that route each input sparsely to specialized experts, scaling capacity efficiently. | Experimenting with expert routing for musical domains; toy MoE for prompt routing; studying efficiency vs dense baselines; adapting experts to ragas; research on MoE stability.
Neuro-Symbolic & Knowledge Graph Integration | Expert | Combines symbolic reasoning (graphs, logic) with neural networks for explainable, grounded AI. | Hybrid QA systems; rule + NN compositional models; knowledge-grounded tutoring; graph reasoning pipelines for music ontology; explainable recommendation systems.
Agentic Multi-Agent Systems | Expert | Orchestration of multiple LLM-based agents that plan, act, and collaborate on complex tasks and workflows. | Practice coach agents; multi-agent composition systems; dataset curation agents; long-term tutoring with memory; autonomous research assistants.
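
The retrieval half of the RAG pattern above can be sketched without any external service. The example below assumes a bag-of-words "embedding" and cosine similarity as a stand-in for a real embedding model and vector database, and it stops at building the grounded prompt; the LLM call itself is omitted. All function names and the sample passages are illustrative.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real system would use a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return the top_k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, passages):
    """Number the passages so the model's answer can cite its sources."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Answer using only the numbered sources.\n{context}\nQuestion: {query}"


documents = [
    "K-Means partitions data into k clusters by minimizing within-cluster variance",
    "DBSCAN is density-based clustering that finds arbitrary shapes and marks noise",
    "PCA is linear dimensionality reduction onto principal variance directions",
]
query = "which clustering method marks noise"
print(build_prompt(query, retrieve(query, documents, top_k=1)))
```

Swapping `embed` for a sentence-embedding model and `retrieve` for a vector store such as FAISS keeps the same overall shape.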

Tools, Libraries & Platforms

Common tools you’ll use across models:

Area | Tools / Libraries
General ML | scikit-learn, pandas, NumPy, SciPy
Deep Learning | PyTorch, TensorFlow, Keras, JAX
Transformers & LLMs | Hugging Face Transformers, Hugging Face Datasets, PEFT (including LoRA), Hugging Face Hub
GNN | PyTorch Geometric, DGL
Retrieval & Vectors | FAISS, Milvus, Annoy, Pinecone, Weaviate, LlamaIndex, LangChain
RL | Gymnasium, Stable Baselines3, RLlib
Generative | diffusers, Jukebox, Magenta, WaveNet, torchaudio
Deployment | Docker, FastAPI, TorchServe, BentoML, Kubernetes
MLOps | Weights & Biases, MLflow, TensorBoard
Graph DB | Neo4j, TigerGraph