Level 1 · Chapter 7.1

What AI Knows &
Doesn't Know

AI can seem omniscient. Ask it anything, and it provides confident answers. But AI has real limits. It only knows what is in its training data. It typically cannot access the internet. It can confidently state false information. Understanding these limits is essential for using AI responsibly.

The Illusion of Omniscience

You ask an AI a question. Within seconds, it provides a detailed, well-reasoned, confident answer. The confidence is compelling. It cites specific facts, provides examples, explains reasoning. Surely, it knows what it is talking about.

But this confidence is deceptive. An AI trained on text from the internet learns to produce fluent, plausible-sounding text, and plausibility is not the same as truth. Understanding what an AI actually knows—the boundaries of its knowledge—is crucial for using it responsibly.

The Core Truth

AI systems only know what is in their training data. They cannot learn from the internet in real-time. They cannot access new information. They do not understand their own limitations well. When you ask about something outside their training data, they guess. And they are often confidently wrong.

Knowledge Cutoff: The Time Boundary

What Is a Knowledge Cutoff?

Every language model has a knowledge cutoff: a specific date beyond which it has no training data. For example, Claude's knowledge cutoff is April 2024, meaning Claude was trained on data up to April 2024 and knows nothing about events after that date. It does not know who won the 2024 World Series, which was played months after the cutoff. It does not know about AI announcements made after April 2024. It cannot tell you current stock prices or recent news.

This is not a limitation the AI can overcome. No matter how you phrase the question, no matter how much you ask it to "check" or "verify," the AI cannot provide information about events after its knowledge cutoff because it literally was not exposed to that information.

When Is the Cutoff Date?

Different AI systems have different cutoff dates depending on when they were trained. Always check the specific AI system you are using. Claude (as of March 2026) has an April 2024 cutoff. Other systems have different dates. Some systems can supplement their training data with real-time internet search, which softens the cutoff, but the underlying model still has one.

Implications for Your Use

If you are asking about recent events, AI is likely to give you wrong answers confidently. If you are asking about something that happened before the cutoff date (say, a company merger that happened in 2023), the AI might know about it if it was covered in the training data. If you are asking about something that happened very recently (in March 2026), AI will definitely not know about it.

The practical implication: always verify current information independently. Do not rely on AI for time-sensitive information. If you need to know what is happening now, use the internet. If you need analysis of historical information (before the knowledge cutoff), AI can help. If you need current analysis of recent events, combine AI with current information sources.
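The cutoff reasoning above can be sketched as a simple date comparison. The cutoff date below is illustrative (check your AI provider's documentation for the real value), and the helper function is hypothetical:

```python
from datetime import date

# Illustrative cutoff only; real cutoffs vary by model and provider.
KNOWLEDGE_CUTOFF = date(2024, 4, 30)

def can_model_know(event_date: date) -> bool:
    """An event can only appear in training data if it happened
    before the knowledge cutoff."""
    return event_date < KNOWLEDGE_CUTOFF

# A 2023 company merger predates the cutoff: the model may know of it.
print(can_model_know(date(2023, 6, 1)))   # True
# A March 2026 event postdates it: the model cannot know about it.
print(can_model_know(date(2026, 3, 1)))   # False
```

Note that `True` here only means the event *could* be in the training data; whether it actually is depends on how widely it was covered.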

Domain Knowledge: The Breadth Boundary

Breadth vs. Depth

AI systems trained on internet-scale data have broad knowledge: they know about most topics covered on the internet. But breadth does not mean depth. An AI might have been trained on millions of web pages about a topic (giving it broad knowledge) but lack the deep, specialized understanding of an expert in that field.

Imagine asking an AI about dermatology. It has been trained on thousands of web pages about skin diseases, medical textbooks, research papers, and forum discussions. It can explain general concepts and has seen diverse case descriptions. But a dermatologist with 20 years of practice knows subtle things about rare variants, has intuition developed through thousands of direct patient interactions, and understands contextual factors that might not be obvious from reading alone.

Dangerous Confidence in Unfamiliar Domains

The problem is that AI's confidence does not track with accuracy. An AI can be equally fluent and confident explaining a common topic (that it has lots of training data for) and a rare, specialized topic (that it has sparse training data for). It has no built-in way to signal "I am less confident about this."

This is particularly dangerous in specialized domains like medicine, law, and engineering, where incorrect information can have serious consequences. A doctor asks an AI about treatment options for a rare disease. The AI generates a plausible-sounding answer based on sparse training data. The doctor, not realizing the AI has no deep expertise in this rare disease, follows the recommendation. If that recommendation is wrong, the patient bears the consequences.

Recognizing Domain Boundaries

How do you know if an AI is likely to have good knowledge about a topic? Consider:

  • Commonness: Is this a topic that lots of people write about? Common topics are covered extensively in training data. Rare topics might be barely covered.
  • Specialized vs. general: Is this general knowledge that appears in many sources, or highly specialized knowledge available only to experts? General knowledge is more likely to be in training data.
  • Recent vs. historical: Is this recent research in an evolving field, or well-established knowledge? Established knowledge is more likely to be thoroughly represented in training data.
  • Controversial or settled: Is there disagreement among experts, or is there consensus? AI can handle settled questions better than controversial ones.

Hallucination: AI Making Things Up

What Is Hallucination?

Hallucination is when an AI generates false information that sounds plausible. It cites sources that do not exist, invents statistics, or describes events that never happened. And it does all of this confidently, as if stating facts.

Hallucination happens because of how AI works. It generates responses token-by-token (word-by-word), predicting the next token based on probability given the previous tokens. If you ask it about something outside its training data, it still generates a response. That response is coherent (it follows grammatical rules and semantic patterns), but it can be factually false because the AI was never exposed to correct information about the topic.
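The mechanism can be illustrated with a toy sketch. The probability table below is entirely invented for illustration and is vastly simpler than a real language model, but it shows the key point: the generator rewards fluency and never checks truth.

```python
import random

# Toy next-token table: (previous two tokens) -> candidate next tokens
# with probabilities. All numbers here are made up for illustration.
next_token_probs = {
    ("the", "capital"): {"of": 0.9, "city": 0.1},
    ("capital", "of"): {"France": 0.5, "Freedonia": 0.5},  # fictional country
}

def sample_next(context):
    """Pick the next token by probability. Nothing in this step
    verifies that the continuation is factually true."""
    probs = next_token_probs[context]
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights)[0]

# The model fluently continues "the capital of ..." even though one
# of its likely continuations refers to a country that does not exist.
print(sample_next(("capital", "of")))
```

Half the time this toy model names a fictional country, and it does so with exactly the same fluency as when it names a real one. That is hallucination in miniature.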

Why AI Cannot Easily Tell When It Is Hallucinating

You might expect an AI to say "I do not know" when asked about something it was not trained on. But this is hard for AI. The AI was trained to complete text, not to evaluate its own knowledge. It has no internal mechanism to check "Do I actually know this, or am I just generating plausible-sounding text?"

An AI trained to never admit uncertainty (a poorly designed system) will confidently state false information rather than express doubt. Even an AI designed to be humble will sometimes still hallucinate. The problem is fundamental to how these systems work.

Examples of Hallucination

Fabricated sources: You ask an AI "What does Professor Smith say about machine learning ethics?" The AI cites a paper or book by Professor Smith on this topic that sounds plausible. You later try to find the paper and discover it does not exist. The AI hallucinated it.

Invented statistics: You ask "What percentage of companies use AI?" The AI responds with a specific statistic: "42% of companies use AI in some capacity." The statistic sounds precise and plausible, but it might be completely made up.

Misremembered facts: You ask about a historical event. The AI provides details that are partially correct (the event did happen) but with wrong specifics (wrong date, wrong people involved, wrong outcome). The AI mixed facts with probability-based guessing and created a plausible-sounding false version.

Using AI Responsibly Given These Limits

Know the Knowledge Cutoff

Always check what date the AI's knowledge extends to. Do not ask about current events and expect accuracy. If you need current information, use the internet. If you need analysis of older information, ask the AI and then verify the facts.

Verify Important Information

If the information is important (you are going to make a decision based on it, or share it with others, or it affects someone's health or safety), verify it independently. Use original sources. Check multiple sources. Do not rely solely on AI.

Recognize Domain Limitations

Be more skeptical of AI output in specialized domains where you are not an expert. If you are asking about something highly specialized and rare, assume the AI might be hallucinating. Consult actual experts for important specialized knowledge.

Cross-Check Facts and Citations

If the AI cites a source, try to verify that source exists and says what the AI claims it says. Many AI hallucinations include invented or misquoted sources. Verify before relying on citations.

Use AI for Brainstorming and Drafting, Not Facts

AI is excellent at generating ideas, creating structure, and drafting text. Use it for these purposes. But fact-checking and verification should be your job. Do not abdicate responsibility for accuracy to the AI.

Key Takeaway

AI systems have real knowledge boundaries: they only know what is in their training data, they have a knowledge cutoff date beyond which they know nothing, they can hallucinate confidently, and they are less reliable in specialized domains with sparse training data.

Understanding these limitations is not about distrusting AI—it is about using it appropriately. Use AI where it is strong (brainstorming, drafting, explaining concepts, analyzing historical information). Verify important information independently. Recognize when you are in AI's zone of uncertainty. This approach lets you leverage AI's strengths while avoiding its pitfalls.

What Comes Next

Now that you understand what AI knows and does not know, Chapter 7.2 shifts focus: how your data is used by AI providers. You will learn how AI systems collect, store, and use data about you and your organization.

Chapter Details
Reading Time: ~50 minutes
Difficulty: Beginner
Prerequisites: Lesson 7 Overview

Part of Lesson 7: Data
Lesson 7 Chapters:
  • 7.1 What AI Knows & Doesn't Know (current chapter)
  • 7.2 Your Data & AI
  • 7.3 Data-Informed Decisions