Lesson Overview
Data is the foundation of AI. Understanding data (what it represents, what it omits, and what it does and does not say) is essential for using AI effectively and responsibly. This lesson focuses on three dimensions of data awareness:
What You Will Learn
- What AI Knows & Doesn't Know (7.1): Training data scope, knowledge cutoffs, data limitations, and why AI can confidently state wrong information about recent events or specialized domains
- Your Data & AI Systems (7.2): How your personal and organizational data is used by AI providers, privacy policies, data retention, how to opt out if possible, and your rights over your data
- Data-Informed Decision Making (7.3): Using AI-generated analysis responsibly, understanding statistical claims, avoiding over-reliance on AI, and combining AI insights with human judgment
Many AI failures are failures of data rather than of the algorithm. Knowing what a dataset covers, what it leaves out, and which claims it can and cannot support is crucial for responsible AI use. Data literacy is becoming as important as traditional literacy.
How This Lesson Is Structured
Each chapter builds your understanding of different aspects of data in AI systems. By the end, you will have practical frameworks for thinking about data: what questions to ask about AI system inputs, how to think about your own data privacy, and how to use AI-generated insights without being fooled by them.
Three Core Data Concepts
1. Data Shapes AI's Knowledge
An AI model's built-in knowledge comes from its training data. If the training data is missing entire domains (events after the knowledge cutoff, specialized technical fields, niche cultural knowledge), the model will not know those things either. Unless it is connected to a search or retrieval tool, a model has no real-time access to the internet, and most deployed models do not update their knowledge from individual conversations. Without such tools, a model knows what it was trained on and nothing more.
2. Data Is Created by People
Data is produced, collected, and labeled by people, or by tools that people design and configure, so it reflects human choices and biases. Knowing who created the data, how it was collected, and what context it reflects is crucial for understanding what it actually represents. Data that looks objective often embeds subjective choices.
3. Decisions Need Both Data and Judgment
Data and AI can inform decisions, but they cannot replace human judgment. The most important decisions combine data-driven insights with human context, values, and understanding. Over-reliance on AI analysis (trusting the number without questioning it) is a common failure mode.
What This Lesson Prepares You For
After completing Lesson 7, you will be able to:
- Critically evaluate what an AI system claims to know and not know
- Ask the right questions about AI training data and scope
- Understand privacy policies and your rights over your data
- Use AI-generated analysis as input to decision-making without being fooled by it
- Explain to non-technical people why AI sometimes confidently states wrong information
- Make decisions that combine data-driven insights with human judgment