Encoder–Decoder Model

Bartosz Roguski
Machine Learning Engineer
July 7, 2025
Glossary Category
LLM

An Encoder–Decoder Model is a neural network architecture that converts variable-length input sequences into variable-length outputs by chaining two components: an encoder that compresses the input (text, audio, image) into an intermediate representation — a single fixed-size context vector in the original formulation, or a sequence of hidden states when attention is used — and a decoder that generates the target sequence one step at a time conditioned on that representation. Introduced for neural machine translation, the architecture underpins tasks such as speech recognition, text summarization, and image captioning. Variants range from RNN-based seq2seq with attention to fully Transformer-based encoder–decoders like T5 and MarianMT.

Key metrics — BLEU, ROUGE, and word error rate (WER) — measure fluency and fidelity, while techniques such as beam search, teacher forcing, and scheduled sampling refine training and inference. By cleanly separating understanding from generation, an Encoder–Decoder Model handles length mismatches between input and output, supports multilingual transfer, and fuels Retrieval-Augmented Generation (RAG) pipelines that plug encoder embeddings into a knowledge retriever before decoding grounded answers.
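The two-stage flow described above can be sketched in a few lines of NumPy. This is a minimal, untrained illustration — the weight names and dimensions are assumptions, not a real model — showing how the encoder folds a variable-length input into one context vector and the decoder emits an output of a different length from it:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 12, 8

# Random, untrained weights -- purely illustrative; names are assumptions.
W_emb = rng.normal(size=(VOCAB, HIDDEN))
W_enc = rng.normal(size=(HIDDEN, HIDDEN))
W_dec = rng.normal(size=(HIDDEN, HIDDEN))
W_out = rng.normal(size=(HIDDEN, VOCAB))

def encode(tokens):
    """Compress a variable-length token sequence into one context vector."""
    h = np.zeros(HIDDEN)
    for t in tokens:
        h = np.tanh(W_emb[t] + W_enc @ h)   # simple RNN-style update
    return h

def decode(context, max_len, bos=0):
    """Greedily emit one token per step, conditioned on the context vector."""
    h, tok, out = context, bos, []
    for _ in range(max_len):
        h = np.tanh(W_emb[tok] + W_dec @ h)
        tok = int(np.argmax(h @ W_out))     # pick the most likely next token
        out.append(tok)
    return out

context = encode([3, 1, 4, 1, 5])   # five-token input ...
output = decode(context, max_len=3) # ... three-token output: lengths differ
```

A trained system would learn these weights (typically with teacher forcing) and replace the greedy `argmax` with beam search at inference time, but the separation of concerns — encode once, then decode step by step — is the same.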