About

This tool was created for CEE 4803 (Art & Generative AI) at the Georgia Institute of Technology.

It pairs with our Libre textbook, AI Fundamentals.

Main Developer:

Kenneth (Alex) Jenkins

Overseeing Professor:

Dr. Francesco Fedele

Source Code:

ML Visualizer (GPLv3)

ML Visualizer

Interactive demonstrations of AI and machine learning architectures

Perceptron

The foundational building block of neural networks - a simple linear classifier

Supervised Classification
🧠

Deep Perceptron (MLP)

Multi-layer neural network with hidden layers for complex decision boundaries

Supervised Deep Learning
🧲

Ising Model

Statistical physics model showing spin interactions and phase transitions

Physics Energy-based
💾

Hopfield Network

Associative memory network that stores and retrieves patterns

Recurrent Memory
📦

Autoencoder

Compress and reconstruct data through a bottleneck latent representation

Unsupervised Compression
🔄

Restricted Boltzmann Machine

An energy-based generative network with two-way connections between visible and hidden layers

Unsupervised Generative
🎲

Variational Autoencoder

Probabilistic encoder-decoder that learns continuous latent distributions

Generative Probabilistic
🌊

Normalizing Flow

Invertible transformations that map simple to complex distributions

Generative Probabilistic
🖼️

CNN Encoder-Decoder

Convolutional encoder-decoder network for image-to-image processing tasks

Computer Vision Spatial
🤖

Transformer

Modern architecture using attention mechanisms to process sequences

Attention NLP
🐍

Mamba2

State space model with linear-time complexity, an efficient alternative to Transformer attention for ultra-long sequences

State-of-the-art Efficient

CUDA Visualization

GPU parallel computing concepts: threads, blocks, and memory hierarchy

GPU Parallel
🎵

Neural Music Generator

Transformer-based architecture that generates original melodies using self-attention mechanisms and sequential pattern learning

Transformer Attention Generative Audio

Perceptron

How it works:

  • Inputs: Different pieces of information (like the checklist items)
  • Weights: How important each piece of information is
  • Sum It Up: Multiply each input by its weight and add them together
  • Make Decision: If the total is high enough, say "yes", otherwise "no"
  • Learn: When wrong, adjust the weights to get better next time

A perceptron is like a simple decision-maker with a checklist. Imagine a bouncer at a party who decides if you can come in based on a few things: Are you on the guest list? Are you wearing nice shoes? Do you have an invitation? The bouncer gives each rule a different importance (weight), adds up the scores, and makes a final yes/no decision. When the bouncer makes mistakes, they learn by adjusting how important each rule is!

Diagram: Input Features → Output Decision
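For the curious, here is a minimal sketch of that weigh-sum-decide-adjust loop in plain NumPy. The data, sizes, and names are made up for illustration (a toy AND problem); this is not the code behind the demo:

```python
import numpy as np

# Toy perceptron: learn the AND rule on two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])                  # desired yes/no answers

w = np.zeros(2)                             # importance (weight) of each input
b = 0.0                                     # bias: how easy "yes" is overall
lr = 0.1                                    # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0   # sum it up, then decide
        error = target - pred
        w += lr * error * xi                # adjust the weights when wrong
        b += lr * error

print(w, b)                                 # learned weights and bias
```

After a handful of passes the weights stop changing and the perceptron answers the AND checklist correctly.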

Restricted Boltzmann Machine (RBM)

How it works:

  • Visible Layer: The data we can see (like pixels in an image)
  • Hidden Layer: Secret patterns the machine discovers (like "pointy ears" or "whiskers")
  • Two-Way Learning: Information flows both directions, like a conversation
  • Restricted: Switches only talk between layers, not within their own layer
  • Uses: Learning patterns, recommender systems, filling in missing data

An RBM is like a box full of light switches that can turn on or off. Some switches are on the front (visible units - things we can see) and some are hidden inside (hidden units - patterns we discover). The cool part? The switches talk to each other! If you flip the visible switches in a certain pattern (like showing it a picture of a cat), the hidden switches learn to recognize "cat-ness". Then you can flip the hidden switches and the visible ones will show you a new cat picture!

Diagram: Visible Layer ↔ Hidden Layer
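A rough NumPy sketch of one "up-down" learning step (contrastive divergence, CD-1). The layer sizes and the example pattern are arbitrary, and a real RBM would loop over many examples:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, lr=0.1):
    """One contrastive-divergence update for a single binary example v0."""
    # Up: visible -> hidden ("which hidden switches explain this input?")
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Down: hidden -> visible ("what input would these hidden switches draw?")
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # Nudge the weights toward the data and away from the reconstruction.
    W = W + lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
    return W, b_v + lr * (v0 - v1), b_h + lr * (p_h0 - p_h1)

n_visible, n_hidden = 6, 3
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
v = np.array([1, 1, 0, 0, 1, 0], dtype=float)   # one toy visible pattern
W, b_v, b_h = cd1_step(v, W, b_v, b_h)
```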

Autoencoder

How it works:

  • Input: The original information (like a full lecture)
  • Encoder: Squishes it down to the most important parts (taking notes)
  • Latent Space: The compressed, super-important information (your notes)
  • Decoder: Expands the notes back to full size (studying from notes)
  • Output: Reconstructed version that should match the input
  • Uses: Image compression, noise removal, anomaly detection

An autoencoder is like a really clever note-taker in class. Instead of writing down everything the teacher says word-for-word, they write short notes with just the most important ideas (encoding). Later, when studying for the test, they can expand those short notes back into full explanations (decoding). If the notes are good, you can recreate almost the whole lecture! The autoencoder learns what's "important enough" to write down to recreate the original information.

Diagram: Input → Encoder → Latent → Decoder → Output
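Here is a bare-bones linear autoencoder in NumPy that squeezes 8 numbers down to 2 and back. Real autoencoders add non-linear layers and use a deep learning framework, so treat this as a sketch of the idea only (all sizes and data are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))            # 200 toy examples, 8 features each

d_in, d_latent = 8, 2                        # squeeze 8 numbers down to 2 "notes"
W_enc = 0.1 * rng.standard_normal((d_in, d_latent))
W_dec = 0.1 * rng.standard_normal((d_latent, d_in))
lr = 0.05

for step in range(1000):
    Z = X @ W_enc                            # encoder: take the notes
    X_hat = Z @ W_dec                        # decoder: expand the notes back out
    err = X_hat - X                          # how far off is the reconstruction?
    # Gradient-descent updates of the mean-squared reconstruction error.
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

print(np.mean((X @ W_enc @ W_dec - X) ** 2)) # reconstruction error drops as it trains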

Ising Model

How it works:

  • Spins: Each cell is a tiny magnet pointing up (white) or down (black)
  • Neighbors: Magnets talk to their neighbors and try to match them
  • Energy: System is "happy" (low energy) when neighbors match
  • Temperature: How random the magnets are (cold = organized, hot = chaotic)
  • Phase Transition: Watch order emerge from chaos as temperature changes!
  • Uses: Understanding magnetism, neural networks, social dynamics

The Ising Model is like a checkerboard where each square is a tiny magnet that wants to point either up or down. Here's the cool part: each magnet wants to match its neighbors - if your neighbor points up, you want to point up too! Temperature is like how much the magnets "wiggle around". When it's cold, all the magnets line up the same way (like everyone wearing the same team jersey). When it's hot, they point randomly (like a messy crowd). This shows how simple rules create complex group behavior!

Interactive demo: lattice of spins with Temperature and Energy readouts (the critical temperature is about 2.27)
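Simulations like this one usually run some variant of the Metropolis algorithm. A compact NumPy version (small lattice, coupling constant set to 1) looks roughly like this:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32                                        # the lattice is N x N tiny magnets
spins = rng.choice([-1, 1], size=(N, N))
T = 2.27                                      # near the critical temperature

def metropolis_sweep(spins, T):
    """One sweep: visit N*N random sites and maybe flip each one."""
    for _ in range(N * N):
        i, j = rng.integers(N, size=2)
        # Sum of the four neighbouring spins (periodic boundaries).
        nb = (spins[(i + 1) % N, j] + spins[(i - 1) % N, j]
              + spins[i, (j + 1) % N] + spins[i, (j - 1) % N])
        dE = 2 * spins[i, j] * nb             # energy cost of flipping this spin
        # Flip if it lowers the energy, or with Boltzmann probability otherwise.
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i, j] *= -1
    return spins

for sweep in range(100):
    spins = metropolis_sweep(spins, T)
print(abs(spins.mean()))   # magnetisation: near 0 when hot, near 1 when cold
```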

Hopfield Network

How it works:

  • Fully Connected: Every neuron talks to every other neuron (like a group chat)
  • Store Memories: Patterns are saved by adjusting connection strengths
  • Pattern Recall: Show a damaged pattern, get the complete memory back
  • Energy Landscape: Memories are valleys; the network rolls into the nearest one
  • Robust: Works even with noisy or incomplete inputs
  • Uses: Pattern recognition, memory restoration, optimization problems

A Hopfield Network is like a magic photo restoration machine. Imagine you have a damaged old photograph with parts missing or blurry. You show it to the network, and it "remembers" complete photos it saw before and fixes your damaged one! It's like when you see half a face and your brain fills in the rest. The network stores memories as patterns, and when you give it a partial or noisy pattern, it rolls downhill into the closest complete memory - like a ball rolling into a valley!

Interactive demo: current network state and a count of stored patterns
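A small NumPy sketch of storing two patterns with the Hebbian rule and recalling one of them from a corrupted copy. The ±1 patterns below are arbitrary examples:

```python
import numpy as np

def store(patterns):
    """Hebbian rule: strengthen links between units that tend to agree."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:                         # each p is a vector of +1/-1 values
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)                     # no unit talks to itself
    return W / len(patterns)

def recall(W, probe, steps=5):
    """Start from a noisy pattern and settle into the nearest stored memory."""
    s = probe.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

patterns = np.array([[ 1,  1,  1,  1, -1, -1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1,  1, -1]])
W = store(patterns)
noisy = np.array([-1, 1, 1, 1, -1, -1, -1, -1])   # first pattern, one bit flipped
print(recall(W, noisy))                            # recovers the first stored pattern
```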

Transformer

How it works:

  • Next-token prediction: Given some text (tokens), it predicts a probability for the next token
  • Pick a token: Choose the most likely token (argmax) or sample using temperature/top-k
  • Repeat: Append that token and predict the next one again — this is how generation works
  • Under the hood: Self‑attention builds a context-aware representation to make that one prediction better
  • That’s it: Transformers are trained to be excellent next‑token predictors

In practice, a Transformer takes the tokens you’ve typed and outputs a probability distribution over the next token. Pick one, append it, and repeat. Internally, self‑attention helps the model use relevant context to make that single step as accurate as possible. Chat, translation, and code generation are just this next‑token game played many times.

Diagram: Prompt → Context → Scores → Softmax → Next Token (top predictions shown)
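The generation loop itself is short. The sketch below fakes the model with random scores (next_token_logits is a hypothetical stand-in, not a real Transformer) purely to show the temperature/top-k sampling step described above:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "."]

def next_token_logits(tokens):
    """Stand-in for a real model: one score per vocabulary word.
    (A real Transformer would compute these with self-attention over `tokens`.)"""
    return rng.standard_normal(len(vocab))

def sample_next(logits, temperature=0.8, top_k=3):
    """Temperature + top-k sampling over the next-token scores."""
    logits = logits / temperature              # sharpen or flatten the scores
    top = np.argsort(logits)[-top_k:]          # keep only the k best candidates
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                       # softmax over the shortlist
    return int(rng.choice(top, p=probs))

tokens = ["the", "cat"]
for _ in range(4):                             # generation = repeat this loop
    logits = next_token_logits(tokens)
    tokens.append(vocab[sample_next(logits)])
print(" ".join(tokens))
```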

Deep Perceptron (MLP)

How it works:

  • Input Layer: Takes in raw data (pixels, numbers, features)
  • Hidden Layers: Each layer learns increasingly complex patterns
  • Neurons: Like tiny decision-makers that add up all their inputs
  • Activation Functions: Add non-linearity so the network can learn curves, not just straight lines
  • Backpropagation: When wrong, the network learns by adjusting every layer's weights, working backwards from the error
  • Uses: Image recognition, spam detection, credit scoring, recommendation systems

A Deep Perceptron (MLP) is like a tower of smart committees, each one making decisions based on what the committee below figured out! The first committee looks at raw information (like pixels in a photo). The second committee looks at patterns the first one found (like "edges"). The third committee spots bigger patterns (like "circles" or "corners"). Each committee learns what's important and passes it up. By the time you reach the top, the network can recognize really complex things like "this is a cat" or "this email is spam!" It's like playing telephone, but each person in line makes the message smarter instead of more confusing.

Diagram: Input → Hidden 1 → Hidden 2 → Hidden 3 → Output
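A forward pass through such a stack of "committees" is only a few lines of NumPy; training would add backpropagation, which deep learning frameworks handle automatically. The sizes and weights here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    """One 'committee': a weight matrix plus a bias vector."""
    return rng.standard_normal((n_in, n_out)) * 0.1, np.zeros(n_out)

def relu(x):
    return np.maximum(0, x)                 # the non-linearity between layers

# 4 input features -> three hidden layers -> 2 output scores.
sizes = [4, 8, 8, 8, 2]
params = [layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(x, params):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:             # no ReLU after the final layer
            x = relu(x)
    return x

x = rng.standard_normal(4)                  # one toy example
print(forward(x, params))                   # raw output scores (before softmax)
```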

Normalizing Flow

How it works:

  • Base Distribution: Starts with simple randomness (like a ball of Play-Doh)
  • Transformation Layers: Each layer warps/twists the data in a reversible way
  • Invertible: Can run forwards (create data) or backwards (analyze data)
  • Exact Probabilities: Knows how likely any particular output is
  • Flow: Data "flows" through transformations like water through pipes
  • Uses: Generating realistic images/audio, density estimation, anomaly detection

A Normalizing Flow is like a Play-Doh factory that can make ANY shape you want! You start with a simple ball of Play-Doh (easy to make). Then you push it through a series of special molds - twist here, stretch there, bend this way. Each mold transforms it step-by-step. The cool part? You can write down EXACTLY what each mold does, so you can reverse the whole process perfectly! If you want another copy, just start with a ball and push through the same molds. This is how AI can create realistic faces, voices, or artwork - it learns what "molds" (transformations) turn random noise into the real thing.

Diagram: Base z → Flow 1 → Flow 2 → Flow 3 → Data x
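A toy version of one invertible "mold" (an affine coupling step) in NumPy. Real flows stack many of these, alternate which half of the coordinates gets warped, and learn the parameters; here everything is random and fixed, just to show invertibility and the log-determinant:

```python
import numpy as np

rng = np.random.default_rng(0)

class AffineCoupling:
    """One 'mold': warps half the coordinates using the other half.
    Easy to invert, and its log-determinant is just the sum of the scales."""
    def __init__(self, dim):
        # A tiny linear 'network' producing scale s and shift t (illustrative).
        self.Ws = 0.1 * rng.standard_normal((dim // 2, dim // 2))
        self.Wt = 0.1 * rng.standard_normal((dim // 2, dim // 2))

    def forward(self, z):
        z1, z2 = np.split(z, 2)
        s, t = z1 @ self.Ws, z1 @ self.Wt
        x2 = z2 * np.exp(s) + t                        # warp the second half
        return np.concatenate([z1, x2]), s.sum()       # sample + log|det Jacobian|

    def inverse(self, x):
        x1, x2 = np.split(x, 2)
        s, t = x1 @ self.Ws, x1 @ self.Wt
        return np.concatenate([x1, (x2 - t) * np.exp(-s)])

flows = [AffineCoupling(4) for _ in range(3)]
z = rng.standard_normal(4)                 # start from simple noise
x, log_det = z, 0.0
for f in flows:                            # push the "Play-Doh" through each mold
    x, ld = f.forward(x)
    log_det += ld

back = x
for f in reversed(flows):                  # and reverse the whole process exactly
    back = f.inverse(back)
print(np.allclose(back, z))                # True: perfectly invertible
```

The running log_det is what lets a flow report the exact probability of any sample it produces.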

Variational Autoencoder (VAE)

How it works:

  • Encoder: Learns to describe inputs as a range (not one exact point)
  • Latent Space: A "map" where similar things are close together
  • Sampling: Picks a point in that range (adds controlled randomness)
  • Decoder: Turns that point back into a full image/data
  • Training: Learns to recreate inputs while keeping latent space organized
  • Uses: Generating new faces, drug discovery, music composition, image interpolation

A VAE is like an artist who doesn't trace drawings exactly - they learn the STYLE! Imagine showing an artist 1000 cat photos. Instead of memorizing each cat, they learn "cats usually have pointy ears, whiskers, and round eyes - but every cat is slightly different." Now when you ask them to draw a new cat, they don't copy an old photo; they create a unique cat using what they learned about "cat-ness." The magic is they also learn WHERE in "cat space" each feature lives (fluffy vs. short-hair, big vs. small), so you can even say "draw me a cat that's halfway between these two!" This is how AI generates new faces, art, or music that look real but never existed before.

Diagram: Input → (μ, σ) → z ~ N(μ, σ) → Decoder → Output
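The key trick is sampling z = μ + σ · noise (the reparameterization trick). Below is a NumPy sketch with stand-in linear encoder/decoder weights; a real VAE uses neural networks for both and trains the whole thing end to end:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative linear "encoder" and "decoder" weights (real VAEs use deep networks).
d_in, d_latent = 8, 2
W_mu  = 0.1 * rng.standard_normal((d_in, d_latent))
W_sig = 0.1 * rng.standard_normal((d_in, d_latent))
W_dec = 0.1 * rng.standard_normal((d_latent, d_in))

def encode(x):
    """Describe x as a *range*: a mean and a spread per latent dimension."""
    return x @ W_mu, np.exp(x @ W_sig)         # mu, sigma (sigma kept positive)

def reparameterize(mu, sigma):
    """Sample z = mu + sigma * noise, keeping the randomness outside the weights."""
    return mu + sigma * rng.standard_normal(mu.shape)

def decode(z):
    return z @ W_dec                            # turn the latent point back into data

x = rng.standard_normal(d_in)
mu, sigma = encode(x)
z = reparameterize(mu, sigma)
x_hat = decode(z)

# Training minimises reconstruction error plus a KL term that keeps the
# latent "map" close to a standard normal distribution.
recon = np.mean((x_hat - x) ** 2)
kl = 0.5 * np.sum(mu**2 + sigma**2 - 2 * np.log(sigma) - 1)
print(recon, kl)
```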

CNN Encoder-Decoder

How it works:

  • Encoder: Shrinks the image while capturing important patterns (like edges and shapes)
  • Bottleneck: The smallest, most compressed version with just the essential information
  • Decoder: Expands it back to full size, adding details back in
  • Skip Connections: Shortcuts that help remember fine details from the original
  • Uses: Removing backgrounds, enhancing photos, medical image analysis

Imagine you're looking at a photo through a magnifying glass that first makes everything blurry and simple (like squinting your eyes), then gradually brings back all the details. The CNN Encoder-Decoder is like a smart camera that first simplifies an image by focusing on the most important shapes and patterns, then rebuilds it with all the details restored - like taking a puzzle apart and putting it back together, but now you understand every piece!

Diagram: Input → Conv↓ → Features → Conv↑ → Output
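A minimal PyTorch sketch of the shrink-then-grow structure (assumes the torch package is installed; layer sizes are arbitrary and skip connections are left out to keep it short):

```python
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    """Minimal sketch: shrink the image, then grow it back."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(   # Conv down: capture shapes, halve the size twice
            nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(   # Conv up: grow back to the input size
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))   # the bottleneck sits between the two

x = torch.randn(1, 1, 28, 28)                  # one fake 28x28 grayscale image
model = TinyEncoderDecoder()
print(model(x).shape)                          # torch.Size([1, 1, 28, 28])
```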

Mamba2 Architecture

How it works:

  • An Alternative to Transformers: Mamba2 is a state space model (SSM) that tackles the main limitation of attention - its cost grows quadratically with sequence length
  • Linear Complexity: Instead of comparing every token with every other token, Mamba2 processes the sequence in a single pass, so its cost grows only linearly with length
  • Selective State Spaces: It keeps a compact, fixed-size state and learns what to write into it and what to forget, rather than attending to everything it has seen
  • Hardware Aware: The recurrence is formulated to run efficiently on modern GPUs, giving substantially higher throughput than attention on long sequences
  • Strong Performance: Matches Transformer quality on many language and sequence benchmarks while scaling to much longer contexts
  • Where It's Used: Long-document understanding, genomics, audio, and time-series modeling, as well as hybrid Transformer/SSM language models
  • Why It Matters: It makes processing very long inputs - entire books, codebases, or conversation histories - far more tractable than quadratic attention

Think of a Transformer as a student who compares every word in a book with every other word to understand it - this gets very slow with long books! Mamba2 is like a speed-reader with a smart notebook: as it reads, it decides "this is important, write it down" or "this is background info, skip it." The notebook stays small and organized, so even with massive texts it keeps reading at the same speed. That is why Mamba-style models are attractive for applications that need really long sequences - from analyzing entire research papers to processing hours of conversation history.

Diagram: Input → State → Gates → Updated State → Output
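A heavily simplified NumPy sketch of the selective state-space idea: keep a small state, and decide per token how much to write and how much to forget. This is illustrative only; a real Mamba2 block learns these parameters, makes more of them input-dependent, and runs as a fused GPU kernel:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_state, seq_len = 4, 8, 16           # tiny sizes, just for illustration

# Illustrative (random) parameters; a real model learns all of these.
A = -np.exp(rng.standard_normal(d_state))      # decay rates for the hidden state
B = 0.1 * rng.standard_normal((d_state, d_model))
C = 0.1 * rng.standard_normal((d_model, d_state))
W_gate = 0.1 * rng.standard_normal(d_model)

x = rng.standard_normal((seq_len, d_model))    # the input sequence
h = np.zeros(d_state)                          # the compact "notebook"
outputs = []

for t in range(seq_len):
    # Input-dependent step size: how much of this token should be written down?
    dt = np.log1p(np.exp(x[t] @ W_gate))       # softplus keeps it positive
    decay = np.exp(A * dt)                     # forget a little of the old notes
    h = decay * h + dt * (B @ x[t])            # write a summary of the new token
    outputs.append(C @ h)                      # read the notes to produce the output

print(np.array(outputs).shape)                 # (16, 4): one output per input token
```

Note that each step touches only the fixed-size state, which is why the cost grows linearly with sequence length.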

CUDA

How it works:

  • GPU Cores: Thousands of mini-processors working simultaneously
  • Thread Blocks: Groups of workers that share information quickly
  • Parallel Threads: Individual workers each doing one small task
  • Shared Memory: A fast whiteboard for each group to share notes
  • Global Memory: The big storage room everyone can access (but it's slower)
  • Uses: Training neural networks, graphics rendering, scientific simulations

Imagine you have a big homework assignment with 1000 math problems. Your regular CPU is like having ONE really smart student who solves each problem one at a time - fast, but it takes a while. A GPU with CUDA is like having a classroom with THOUSANDS of students who each solve one problem at the same time! Even though each student might be a bit slower than the super-smart one, when they all work together, they finish the whole assignment way faster. That's why GPUs are perfect for training AI!
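From Python, the Numba library exposes this thread/block model directly: you write a kernel, choose how many threads per block and how many blocks, and launch it. A small sketch (requires the numba package and a CUDA-capable GPU):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)              # this thread's global index across all blocks
    if i < x.shape[0]:            # guard: the last block may have spare threads
        out[i] = x[i] + y[i]      # each thread handles exactly one element

n = 1_000_000
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.zeros_like(x)

threads_per_block = 256                                  # one "group of workers"
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)         # launch the whole classroom

print(out[:5])                    # [0. 3. 6. 9. 12.]
```

Every element is computed by its own thread, which is the same pattern GPUs use to crunch the huge matrix operations inside neural networks.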