In 2025, Artificial Intelligence (AI) and Large Language Models (LLMs) have shifted from novelty to necessity.
According to a working paper released in September 2025 by OpenAI, the research and deployment company behind ChatGPT, the platform was handling 18 billion messages each week from 700 million users by July 2025. That user base represents approximately 10% of the global adult population.
Global adoption is accelerating. In Australia, AI is rapidly embedding itself in both enterprise operations and day-to-day tools.
Understanding the core terminology behind LLMs is no longer optional — it is essential to unlocking productivity, strategic insight, and personalisation at scale.
The challenge? Bridging the knowledge gap to ensure marketing and communications professionals are making the most of these technologies.
At Ardent, we are consistently investing in that upskilling journey through internal workshops, Lunch & Learns and client collaboration. To support marketers and PR professionals alike, we have compiled a clear, concise glossary of the most important LLM and generative AI concepts.
Common Generative AI Terms and What They Mean
Use the list below to quickly jump to a term:
- Tokens
- Context Window
- Hallucinations
- Retrieval-Augmented Generation (RAG)
- Grounding
- Model Context Protocol (MCP)
1. Tokens
Tokens are the smallest units of text that LLMs process. Before a model can understand a human’s prompt input, it must break it down into a sequence of tokens through a process called tokenisation.
For example, the ChatGPT tokeniser would break "Hello this is Agent Referencing" into individual chunks, where some words like "Referencing" may be split into multiple tokens.


According to Google's Gemini documentation, a token is equivalent to about 4 characters for Gemini models, and 100 tokens equate to roughly 60-80 English words.
Tokens determine both processing time and cost, as well as how much information the model can consider within a single prompt.
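The 4-characters-per-token rule of thumb above can be turned into a quick budgeting helper. This is only a ballpark estimator, not a real tokeniser (production models use byte-pair encoding or similar subword schemes), and the function name is our own illustration:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4-characters-per-token rule of thumb.

    Real tokenisers split text into subword units, so treat this as a
    ballpark figure for planning prompt sizes and costs, nothing more.
    """
    return max(1, round(len(text) / chars_per_token))

prompt = "Hello this is Agent Referencing"
print(estimate_tokens(prompt))  # ~8 tokens for this 31-character string
```

A helper like this is handy when deciding whether a long document will fit in a prompt before you ever call a model.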
2. Context Window
The context window defines how much information a model can “remember” in a single interaction. Technically, it refers to the number of tokens that can be considered at once.
As IBM explains, a larger context window enables a model to understand more of your prompt, reference earlier parts of the conversation, and provide richer, more accurate responses.

Modern models like GPT-4o or Claude 3 have significantly larger context windows than earlier versions, making them more effective at understanding complex inputs or full documents.
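Because the context window is a hard token limit, applications typically trim older conversation turns to stay within it. The sketch below shows one common strategy, keeping only the most recent messages that fit a token budget; the heuristic token counter and the function name are our own simplifications:

```python
def fit_to_context(messages: list[str], budget: int,
                   count_tokens=lambda m: len(m) // 4 + 1) -> list[str]:
    """Keep the most recent messages that fit within a token budget.

    `count_tokens` is a stand-in for a real tokeniser; here we reuse
    the ~4-characters-per-token heuristic.
    """
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest-first
        cost = count_tokens(msg)
        if used + cost > budget:
            break                        # out of room: drop older turns
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

history = ["old question", "old answer", "new question about the brief"]
print(fit_to_context(history, budget=12))
```

This is why a chatbot can "forget" the start of a long conversation: earlier turns fall outside the window and are simply no longer sent to the model.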
3. Hallucinations
A hallucination occurs when a model confidently generates incorrect, misleading, or entirely made-up information.
This is not a rare glitch. Hallucinations are a known limitation of LLMs, often caused by outdated training data, biased sources, or incomplete information.
A real-world example includes Deloitte Australia issuing a partial refund for a $290,000 report after it was found to contain multiple hallucinated elements — including a non-existent academic reference and a fabricated federal court quote.
Another well-documented example is from OpenAI’s Andrej Karpathy, who demonstrated a hallucination where the model provided a detailed biography of a person who does not exist.

These risks underline the importance of grounding and human review, especially when using AI for high-stakes content.
4. Retrieval-Augmented Generation (RAG)
RAG is a powerful framework that enables LLMs to fetch live, external data (e.g. from PDFs, internal documents, or a website) to generate more accurate and up-to-date responses.
Standard LLMs are limited to their training data, which can become outdated quickly. RAG extends their capabilities by allowing dynamic information retrieval, making answers more relevant and timely.

A practical use case would be a chatbot that supports internal HR queries. When an employee asks, “How much annual leave do I have?”, the system retrieves the leave policy document and the employee’s leave history. It then generates a response grounded in those specific documents.
This process relies on vector embeddings to calculate relevance — matching user queries to the most relevant source material in a mathematically precise way.
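To make the retrieval step concrete, here is a deliberately tiny sketch. Real RAG systems use learned dense embeddings from a neural model; we substitute a bag-of-words vector and cosine similarity so the example runs with the standard library alone. The document names and texts are hypothetical:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real RAG uses learned dense vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda name: cosine(q, embed(docs[name])),
                    reverse=True)
    return ranked[:k]

docs = {
    "leave_policy": "annual leave accrues at four weeks per year",
    "expense_policy": "submit expense claims within thirty days",
}
print(retrieve("How much annual leave do I have?", docs))  # ['leave_policy']
```

The retrieved text is then pasted into the model's prompt, so the generated answer is grounded in the matched document rather than the model's training data.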
5. Grounding
Grounding is the process of anchoring an AI model’s output to verifiable data.
Instead of relying solely on what the model learned during training, grounding connects responses to external information sources, reducing the likelihood of hallucination and increasing reliability.

In practice, this might involve providing an AI assistant access to a trusted set of data — internal documentation, fact-checked websites, or structured databases. The assistant will then prioritise these sources when answering a user’s query.
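In prompt terms, "prioritising trusted sources" often comes down to how the request to the model is assembled. The sketch below shows one plausible pattern, with hypothetical source names; the key ingredient is the instruction to answer only from the supplied material and to admit when it is insufficient:

```python
def grounded_prompt(question: str, sources: dict[str, str]) -> str:
    """Assemble a prompt that anchors the answer to trusted sources.

    Instructing the model to answer ONLY from the sources (and to say
    so when they are insufficient) is what reduces hallucination risk.
    """
    context = "\n".join(f"[{name}] {text}" for name, text in sources.items())
    return (
        "Answer using ONLY the sources below. "
        "If they do not contain the answer, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

prompt = grounded_prompt(
    "What is our refund window?",
    {"returns_faq": "Refunds are available within 30 days of purchase."},
)
print(prompt)
```

The exact wording varies by product, but most grounded assistants follow this shape: trusted context first, then a constraint, then the user's question.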
Some LLMs also use grounding by accessing the web in real time.
In the example below, we can see that ChatGPT issues actual web search queries behind the scenes (visible in the Chrome Network console) to verify information for your prompt.

Another example comes from Google Vertex AI, which supports grounding model output in several ways:
| Grounding type | Description |
| --- | --- |
| Grounding with Google Search | Connect your model to world knowledge and a wide range of topics using Google Search. |
| Grounding with Google Maps | Use Google Maps data with your model to provide more accurate and context-aware responses to your prompts, including geospatial context. |
| Grounding with Vertex AI Search | Use retrieval-augmented generation (RAG) to connect your model to your website data or your sets of documents stored in Vertex AI Search. |
| Grounding with Vertex AI RAG Engine | Ground using your data through Vertex AI RAG Engine, a configurable managed RAG service. |
| Grounding with Elasticsearch | Use retrieval-augmented generation with your existing Elasticsearch indexes and Gemini. |
| Grounding with your search API | Connect Gemini to your external data sources by grounding with any search API. |
| Web Grounding for Enterprise | Use a web index suitable for highly regulated industries to generate grounded responses with compliance controls. |
6. Model Context Protocol (MCP)
Introduced by Anthropic in late 2024, MCP is a new standard designed to help LLMs interact securely with external systems.

Think of MCP as a universal “plug” that allows AI to connect with apps, databases, or tools — much like a USB-C port. It enables LLMs to retrieve data, take action, and execute functions across different platforms.

This innovation transforms static models into dynamic agents. Instead of simply responding with text, an AI using MCP could (for example) check your calendar, fetch a live document, or trigger an automation — all securely and contextually.
MCP is a step toward the future of truly agentic AI, where models are not just chatbots but intelligent collaborators.
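Under the hood, MCP is built on JSON-RPC 2.0 messages exchanged between an AI client and a tool server. The sketch below shows the general shape of a tool-invocation request; the tool name ("calendar.check_availability") and its arguments are hypothetical, so consult the MCP specification for exact schemas:

```python
import json

# A simplified sketch of a JSON-RPC 2.0 request as used by MCP.
# Tool name and arguments here are invented for illustration only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "calendar.check_availability",
        "arguments": {"date": "2025-11-03"},
    },
}
print(json.dumps(request, indent=2))
```

The point of the standard is that any MCP-compatible model can send this same message shape to any MCP server, which is what makes the "universal plug" analogy apt.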
Final Thoughts
This glossary is not intended to be exhaustive, but rather a starting point for building clarity and confidence when working with AI. At Ardent, we believe marketers do not need to become engineers — but we do need to speak the language of the tools we use.
Understanding these concepts will help your team move from simply experimenting with AI to embedding it strategically across your marketing, comms, and CX workflows.
If there are terms you’d like us to expand on, or if you’re exploring how to apply AI and LLMs within your organisation, we’d love to help. Get in touch — and let’s continue learning together.
