The Complete Field Guide · 2025–2026

Understanding the AI Landscape

For anyone stepping into AI for the first time: creators, business owners, students, or just the curious. So many tools, so many names, so much jargon. This guide puts it all in one place, explained in plain English.

6 core concepts · 20 terms decoded · 58 models covered · 9 categories

The Core Concepts - What Is Everything?

Before diving into the model table, you need to understand what all these different things actually are. Here's what each one means, explained in plain English.

01
Foundation

LLM - Large Language Model

The actual AI brain - a mathematical system with billions of parameters trained on vast amounts of text. It predicts the next word, one at a time, to form coherent responses.

Analogy: A brain that has read every book in every library and can write fluently on any topic.
02
Versions

Model

A specific trained and versioned instance of an LLM - "Claude Sonnet 4.6", "GPT-4o", "Llama 3.3 70B". The "70B" means 70 billion parameters.

Analogy: If LLM is an engine technology, a specific model is a specific car - same family, different specs.
03
Infrastructure

Engine / Runtime

The software that actually runs a model - loads the file, manages memory, handles requests. The model file itself is inert without an engine.

Analogy: The car's drivetrain and electronics. The model is the vehicle body. Both needed to move.
04
Distribution

Open Source vs Closed

Open-source: you can download, self-host, modify for free (Llama, Mistral). Closed: access only via API or chat (GPT-4o, Claude, Gemini).

Analogy: Open = full recipe. Closed = order food from their restaurant - kitchen is off-limits.
05
Specialization

Multimodal & Specialized

Multimodal models process text plus images, audio, or video. Beyond those sit the specialists: coding models, image generators, speech models, and embedding models for AI-powered search.

Analogy: Hiring specialists - general contractor vs electrician vs plumber. Right tool for the job.
06
Access

API - Application Programming Interface

Programmatic way to talk to a model. When developers build apps, they call the API directly - scalable, without a chat interface.

Analogy: Chat = walking into a restaurant. API = calling the kitchen directly - faster and programmable.
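In code, "calling the kitchen directly" looks roughly like this. The endpoint and model name below are hypothetical placeholders, not a real service, but most chat APIs expect this general shape: a model name, a list of messages, and sampling settings.

```python
import json

API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical endpoint

def build_chat_request(user_message, model="example-model", temperature=0.7):
    """Assemble the JSON payload a typical chat API expects:
    which model to use, the conversation so far, and how much
    randomness to allow in the response."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
    }

payload = build_chat_request("Summarise this email in two sentences.")
print(json.dumps(payload, indent=2))
```

An app would POST this payload to the provider's URL with an API key attached; the response comes back as structured JSON rather than a chat window.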

How Does an LLM Actually Work?

A plain English walkthrough of exactly what happens between you pressing Send and the AI responding.

Your prompt → Tokenization → Embedding → Transformer layers → Token prediction → Repeat → Response
01 · Tokenization

Your text gets chopped up

Your message is split into tokens (≈3-4 characters each). AI models have token limits - their "context window".
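A toy sketch of the idea. Real tokenizers use learned byte-pair encoding with variable-length chunks, not fixed 4-character slices, so this is only an illustration of "text gets chopped up":

```python
def toy_tokenize(text, chunk_size=4):
    """Crude stand-in for a real tokenizer: split text into
    fixed-size character chunks. Real tokenizers learn their
    chunks from data, but the principle is the same."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

text = "Hello, how are you today?"
tokens = toy_tokenize(text)
print(tokens)
print(f"{len(tokens)} tokens for {len(text)} characters")
```

The context window is simply a cap on how many of these tokens (your prompt plus the response so far) the model can hold at once.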

02 · Embedding

Words become numbers

Each token becomes a vector of numbers encoding meaning. Similar words get mathematically close vectors.
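A minimal sketch of why "similar words get close vectors" matters. The three-number vectors here are made up for illustration; real embeddings have hundreds or thousands of dimensions, learned rather than hand-written:

```python
import math

def cosine_similarity(a, b):
    """Standard closeness measure for vectors: 1.0 = same direction,
    0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made "embeddings"; dimensions loosely mean
# [animal-ness, size, domestic-ness].
embeddings = {
    "cat": [0.9, 0.2, 0.8],
    "dog": [0.9, 0.4, 0.9],
    "car": [0.0, 0.6, 0.1],
}

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # close to 1
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # much lower
```

This same trick powers AI search: embed your query, embed your documents, return whatever is mathematically closest.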

03 · Attention

Every word watches every other

Through transformer layers, every token "attends" to every other - understanding references and context.
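One attention head in miniature, with toy two-number vectors made up for illustration. The query token scores itself against every other token; a softmax turns the scores into weights that sum to 1, and higher weight means "attend here more":

```python
import math

def softmax(xs):
    """Turn arbitrary scores into probabilities that sum to 1."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score the query against every key with a dot product,
    then normalise the scores into attention weights."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

# Toy vectors for the tokens in "the cat sat"; the query is "sat".
keys = [[0.1, 0.0], [0.9, 0.3], [0.4, 0.8]]   # "the", "cat", "sat"
weights = attention_weights([0.8, 0.4], keys)
print([round(w, 2) for w in weights])  # "cat" gets the most attention
```

A real transformer runs dozens of these heads in parallel across dozens of layers, which is how "it" in a sentence gets linked back to the right noun.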

04 · Prediction

Pick the next likely word

The model produces a probability distribution and samples the next token. Randomness gives variety.
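A sketch of the sampling step, assuming the simplest scheme: softmax plus a temperature knob (real engines layer on extras like top-p). Low temperature makes the model predictable; high temperature makes it varied:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Turn the model's raw scores into probabilities, then sample.
    Lower temperature sharpens the distribution; higher flattens it."""
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(probs)), weights=probs)[0]

vocab = ["mat", "sofa", "moon", "bicycle"]
logits = [3.0, 2.5, 0.5, -1.0]   # scores for "The cat sat on the ..."
idx = sample_next_token(logits, temperature=0.8, rng=random.Random(0))
print(vocab[idx])
```

This randomness is why asking the same question twice can give two different answers.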

05 · Repeat

Do it again, and again

Each new token is fed back in. A 200-word response takes 250+ prediction cycles, milliseconds each.
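The loop in miniature. The "model" here is a hard-coded lookup table rather than a neural network, but the feed-the-output-back-in shape is exactly what a real engine runs, thousands of times per response:

```python
# Toy next-token table standing in for a trained model.
bigram = {
    "the": "cat", "cat": "sat", "sat": "on",
    "on": "the_mat", "the_mat": "<end>",
}

def generate(prompt_token, max_tokens=10):
    """Autoregressive generation: predict one token, append it,
    feed the longer sequence back in, repeat until done."""
    sequence = [prompt_token]
    for _ in range(max_tokens):
        next_token = bigram.get(sequence[-1], "<end>")
        if next_token == "<end>":
            break
        sequence.append(next_token)  # fed back in on the next cycle
    return sequence

print(generate("the"))
```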

06 · Fine-tuning

How it learns to be helpful

RLHF (Reinforcement Learning from Human Feedback) trains the model to produce highly-rated responses.

Note

Parameters, weights, model size - all refer to the same thing: the billions of numbers inside a model. More parameters = more capability, but also more memory and compute. A 7B model runs on a laptop; 70B needs a gaming PC; 405B needs a server.
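A quick back-of-the-envelope for those sizes. Two bytes per parameter assumes fp16 weights; quantized models (e.g. 4-bit, roughly half a byte per parameter) need far less, which is how a 7B model squeezes onto an ordinary laptop:

```python
def model_memory_gb(params_billions, bytes_per_param=2):
    """Rough memory needed just to hold the weights in RAM or VRAM.
    Ignores overhead for the context window and the runtime itself."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for size in (7, 70, 405):
    print(f"{size}B model at fp16: ~{model_memory_gb(size):.0f} GB")
print(f"7B model at 4-bit:  ~{model_memory_gb(7, bytes_per_param=0.5):.1f} GB")
```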

So what?

This is why AI can feel brilliant one moment and confidently wrong the next. It is not thinking. It is pattern-matching at an enormous scale. Understanding that changes how you use it: always verify anything important, and treat it like a very well-read assistant, not an oracle.

Not sure where to begin?

With 58 models out there it is easy to feel lost. Here is the honest short answer based on what you actually want to do.

I just want to chat with AI
Start with Claude or ChatGPT. Both are free to try, no setup needed. Claude tends to be better for writing and nuanced conversations. ChatGPT has the larger ecosystem of integrations and custom GPTs.
I want to generate images
Try Midjourney for artistic, high-quality results. DALL·E 3 (inside ChatGPT) is easier for beginners. Flux is free and surprisingly powerful if you are willing to tinker.
I want to use AI for my business
Claude or ChatGPT Plus handle most business tasks: writing, summarising, drafting emails, analysing documents. No coding required. Both have file upload so you can feed them your actual documents.
I want privacy, no data sent online
Run a model locally with Ollama (free, one command to install) or LM Studio (desktop app, no terminal needed). Pair with Llama 3 or Gemma 3, both run well on a normal laptop.
I want to make music with AI
Suno generates full songs from a text prompt: pick a genre, mood, and lyrics. Free tier available. It is genuinely impressive for demos, ideas, and background tracks.
I want to experiment and learn
Start with Gemini 2.0 Flash. It has a free API, a generous limit, and supports text, images, and code. It is what most AI automation tutorials use as a starting point.

Terminology Decoder

Every term you'll encounter in the AI world - decoded in plain English with examples.

LLM
Large Language Model - the AI brain trained on text.
GPT-4, Claude, Llama
Model
A specific trained and versioned LLM.
Claude Sonnet 4.6, GPT-4o
Parameters / Weights
Billions of numbers inside a model - encodes all "knowledge".
Llama 3 70B = 70 billion params
Tokens
Chunks AI reads/writes (~3-4 characters each).
"Hello world" ≈ 3 tokens
Context Window
How much text the AI can hold in memory at once.
Gemini 1.5 = 1M tokens
Inference
Running a model to generate a response.
Inference cost = compute cost
Fine-tuning
Training further on specific data to specialize.
Medical fine-tuned model
RLHF
Reinforcement Learning from Human Feedback - trains helpfulness.
What makes Claude helpful
RAG
Retrieval-Augmented Generation - looks up docs before answering.
Chatbot reading your company docs
Embedding
Converting text into vectors for search & similarity.
nomic-embed-text
Hallucination
AI confidently generates false information.
Inventing a fake citation
Multimodal
Handles text + images/audio/video.
GPT-4o, Gemini 1.5 Pro
Open Source / Open Weight
Weights are public - download and self-host for free.
Llama 3, Mistral, Gemma
Reasoning Model
"Thinks step by step" before answering - better at logic/math.
o3, DeepSeek R1, QwQ-32B
Agentic AI
Can take autonomous actions - browse web, run code, call APIs.
Claude Code, Cursor
Engine / Runtime
Software that loads and runs a model.
llama.cpp, Ollama, vLLM

The Full Model Reference Table

Browse 58+ AI models. Green = free/open-source; blue = free API tier; amber = freemium; gray = paid/closed.

Access - whether it costs money to use
Open - download and run it yourself for free
Local - runs on your own computer, no internet needed
Snapshot April 2026. This space moves fast.
Categories: Frontier LLM · Open-Source LLM · Multimodal · Reasoning · Coding · Image Gen · Audio · Embedding · Local Runner
Columns: Name · Maker · Category · What it does · Access · Open? · Local?

The Honest Takeaway

After all of that, here is what actually matters for a normal person getting started with AI.

01
You do not need to understand all of this to use AI

Most people will get enormous value just from Claude or ChatGPT. You do not need to know what a transformer is. You just need to know how to ask good questions.

02
The jargon is mostly marketing

“Multimodal”, “frontier”, “agentic”. Companies use these words to sound impressive. Strip them away and the question is always the same: does it do what I need, at a price I can afford?

03
Free options are genuinely good now

Claude free, ChatGPT free, Gemini free. Any of these would have been considered remarkable just two years ago. Start free. Only pay when you hit real limits.

04
AI is a tool, not a replacement for thinking

It hallucinates. It can be confidently wrong. It reflects patterns from its training data, not truth. The best way to use it is as a very fast first draft. Then apply your own judgment.

05
This space will look different in 12 months

The models in the table above were not all here a year ago. Some will be replaced, some will merge, new ones will appear. The concepts in Chapters 01–02 will stay true. The tools will keep changing.