Argentinian Insult Generator
A humorous NLP experiment created during the 2022 World Cup. Generates unique, culturally accurate Argentinian insults using Markov chains and Twitter data.

Insults in DB: 200+
World Cup Wins: 1 🇦🇷
# Overview
Created specifically for the Qatar 2022 World Cup craze, this project started as a fun experiment to capture the passion and linguistic creativity of Argentinian football fans.
We scraped and curated a dataset of more than 200 authentic insults posted on Twitter during the matches. The system trains a lightweight language model built on Markov chains from these samples, then generates new, grammatically coherent (and hilarious) combinations that sound authentic.
It became a viral hit among friends during Argentina's run to the championship.
Problem
During the World Cup, we wanted to celebrate the unique 'folklore' of Argentinian football fandom, especially the fans' creative use of language in banter, and to do it through code and automation.
Solution
I built a web application backed by a probabilistic text generation model. By feeding it a curated dataset of real tweets, the algorithm learns word probabilities and sentence structures to construct new, never-before-seen insults that maintain the specific cadence and slang of the region.
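The idea of learning word probabilities from real samples can be sketched roughly as below. This is a minimal illustration, not the project's actual engine: the function names `build_chain` and `generate`, the `START`/`END` markers, and the three placeholder phrases are all assumptions made up for the example, not entries from the real dataset.

```python
import random
from collections import defaultdict

# Hypothetical sentence boundary markers (not taken from the project).
START, END = "<s>", "</s>"


def build_chain(sentences, state_size=2):
    """Map each tuple of `state_size` consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue
        padded = [START] * state_size + words + [END]
        for i in range(len(padded) - state_size):
            state = tuple(padded[i:i + state_size])
            # Repeated followers stay in the list, so frequent transitions
            # are proportionally more likely to be sampled.
            chain[state].append(padded[i + state_size])
    return chain


def generate(chain, state_size=2, max_words=20, rng=random):
    """Walk the chain from the start state until END or `max_words` is reached."""
    state = (START,) * state_size
    output = []
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:
            break
        next_word = rng.choice(followers)
        if next_word == END:
            break
        output.append(next_word)
        state = state[1:] + (next_word,)
    return " ".join(output)


# Mild placeholder phrases invented for this sketch, not real dataset entries.
corpus = [
    "sos un desastre con la pelota",
    "andá a llorar al vestuario",
    "sos un pecho frío",
]
chain = build_chain(corpus, state_size=1)
print(generate(chain, state_size=1))
```

With `state_size=1` the walk can jump between source phrases at any shared word, which produces new combinations. Larger values make the output follow the source phrases more closely, and on a small corpus it may simply reproduce them.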
# Features
- Custom NLP engine based on Markov Chains
- Curated dataset of 200+ authentic slang tweets
- Instant text generation with 'Copy to Clipboard'
- Lightweight, server-side rendering for speed
- Minimalist, football-themed UI
# Screenshots

Minimalist interface generating unique combinations of slang.

An example of a generated phrase capturing the local dialect.
# Challenges
- Cleaning and normalizing Twitter data (slang, typos, abbreviations) to ensure coherent output.
- Tuning the Markov chain 'state size' to balance between copying phrases and generating nonsense.
- Deploying a Python application with minimal latency on a free-tier PaaS.
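As an illustration of the first challenge, a normalization pass over raw tweets might look something like the following. It is a sketch under assumptions: the regular expressions, the `ABBREVIATIONS` table, and the helper names `normalize_tweet` and `clean_corpus` are hypothetical and do not describe the cleaning rules actually used in the project.

```python
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Three or more repeats of the same character, e.g. "nooooo".
REPEAT_RE = re.compile(r"(.)\1{2,}")

# Hypothetical abbreviation table; a real one would be built from the data.
ABBREVIATIONS = {"q": "que", "xq": "porque", "tmb": "también"}


def normalize_tweet(text):
    """Strip URLs, mentions and hash signs, lowercase, and expand abbreviations."""
    text = unicodedata.normalize("NFC", text)  # consistent accented characters
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the symbol
    text = text.lower()
    # Collapse long runs to two characters so "ll" and "rr" survive.
    text = REPEAT_RE.sub(r"\1\1", text)
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def clean_corpus(tweets):
    """Normalize every tweet, dropping empty results and exact duplicates."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        normalized = normalize_tweet(tweet)
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned
```

Collapsing repeats to two characters rather than one is a deliberate compromise: it keeps legitimate Spanish double letters intact, at the cost of leaving forms like "noo" that may still need manual review.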
# Learnings
- Fundamentals of Natural Language Processing and probabilistic models.
- Techniques for web scraping and data sanitation.
- Deployment workflows with Docker and Fly.io.
- How simple algorithms can effectively mimic complex cultural linguistic patterns.
Tech Stack
NLP/AI
Backend
Data
Deployment
Collaborators
Quick Info
- Status: live
- Started: November 2022
- Completed: December 2022