Explore Natural Language Processing and modern language models interactively
Full hyperparameter control & visualization
Word2Vec, GloVe, and semantic spaces
Self-attention and multi-head attention
Encoder-decoder and BERT/GPT models
BPE, WordPiece, and subword tokenization
Translation and text generation
Text classification and emotion detection
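To make the subword-tokenization topic above concrete, here is a minimal sketch of how BPE learns merge rules from a toy corpus. The function name `bpe_merges` and the example words are illustrative, not part of the app:

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn byte-pair-encoding merges from a toy corpus.

    Each word starts as a tuple of characters; at every step the most
    frequent adjacent symbol pair is merged into a single new symbol.
    """
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the chosen merge to every word in the vocabulary.
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

merges, vocab = bpe_merges(["low", "lower", "lowest", "low"], num_merges=3)
```

On this corpus the first merges build up the shared stem ("l"+"o", then "lo"+"w"), which is exactly why BPE compresses frequent subwords. WordPiece differs mainly in scoring merges by likelihood gain rather than raw frequency.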
Train NLP models step-by-step with full control over architecture and hyperparameters. Watch attention patterns evolve and performance improve.
Core model architecture
Learning rate (lower for fine-tuning)
Batch size (samples per gradient update)
Epochs (training iterations)
Embedding dimension (token representation size)
Attention heads (multi-head attention)
Dropout (regularization strength)
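The controls above can be summarized as a single configuration object. This is a hedged sketch: the class name `TrainingConfig` and the default values are illustrative assumptions, not the app's actual defaults:

```python
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    # Fields mirror the hyperparameter controls above; defaults are examples.
    learning_rate: float = 3e-4   # use lower values (e.g. 1e-5) for fine-tuning
    batch_size: int = 32          # samples per gradient update
    epochs: int = 10              # full passes over the training data
    d_model: int = 128            # token representation (embedding) size
    num_heads: int = 4            # attention heads; must evenly divide d_model
    dropout: float = 0.1          # regularization strength

cfg = TrainingConfig()
# Each head attends over d_model // num_heads dimensions, so this must divide evenly.
assert cfg.d_model % cfg.num_heads == 0
```

A common pitfall this encodes: the embedding dimension and head count are coupled, since multi-head attention splits `d_model` evenly across heads.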
Train Loss: 0.000
Val Loss: 0.000
Train Acc: 0.0%
Val Acc: 0.0%
BLEU Score: 0.0
| Epoch | Train Loss | Val Loss | Train Acc | Val Acc | BLEU |
|---|---|---|---|---|---|
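The BLEU score tracked above is, at its core, a geometric mean of clipped n-gram precisions times a brevity penalty. A simplified sentence-level sketch (real toolkits add smoothing and corpus-level aggregation):

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU over token lists."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(candidate) > len(reference) else math.exp(
        1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

score = bleu("the cat sat on the mat".split(),
             "the cat sat on the mat".split())
```

A perfect match scores 1.0, and dropping words lowers both the n-gram precisions and the brevity penalty.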
Scaled Dot-Product Attention:
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$

Multi-Head Attention:
$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\,W^O$, where $\mathrm{head}_i = \mathrm{Attention}(QW_i^Q, KW_i^K, VW_i^V)$

Positional Encoding:
$PE_{(pos,\,2i)} = \sin\!\left(pos / 10000^{2i/d_{\mathrm{model}}}\right)$, $\quad PE_{(pos,\,2i+1)} = \cos\!\left(pos / 10000^{2i/d_{\mathrm{model}}}\right)$

Cross-Entropy Loss:
$\mathcal{L} = -\sum_{i} y_i \log \hat{y}_i$
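The attention and positional-encoding definitions above can be implemented in a few lines of NumPy. This is a reference sketch, not the app's internal code; shapes and the random test input are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # rows sum to 1
    return weights @ V, weights

def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sin on even dimensions, cos on odd ones."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Toy self-attention: queries, keys, and values all come from the same input.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8)) + positional_encoding(5, 8)
out, w = scaled_dot_product_attention(x, x, x)
```

The `1/sqrt(d_k)` scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with vanishing gradients.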