Security Risk Assessment Framework
Security focuses on whether a system is vulnerable to unauthorized access, tampering, or disruption; privacy focuses on whether personal data is being excessively collected, used beyond its intended scope, or retained longer than necessary. Although security and privacy overlap, they are not interchangeable.
When evaluating an AI capability, I write two columns: on the left, “Can an attacker get in?”; on the right, “Is user data being used appropriately?” Only when both columns pass can the system be considered reasonably robust.
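The two-column check can be written down as a small data structure. The following is a minimal sketch, not a complete assessment method: the class name `Assessment`, the check names, and the pass/fail values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Assessment:
    """Two-column review of one AI capability (hypothetical structure)."""

    capability: str
    # Left column: "Can an attacker get in?"
    security_checks: dict[str, bool] = field(default_factory=dict)
    # Right column: "Is user data being used appropriately?"
    privacy_checks: dict[str, bool] = field(default_factory=dict)

    def failed(self) -> list[str]:
        """Return the names of all checks that did not pass, from both columns."""
        failures = []
        for column in (self.security_checks, self.privacy_checks):
            failures += [name for name, ok in column.items() if not ok]
        return failures

    def passes(self) -> bool:
        """Both columns must contain checks, and every check must pass."""
        return (
            bool(self.security_checks)
            and bool(self.privacy_checks)
            and not self.failed()
        )


# Illustrative values only; a real review would derive these from evidence.
review = Assessment(
    capability="voice assistant transcription",
    security_checks={"api_requires_auth": True, "inputs_validated": True},
    privacy_checks={"only_needed_fields_collected": True, "retention_limit_set": False},
)
print(review.passes())
print(review.failed())
```

Requiring both columns to be non-empty is a deliberate choice in this sketch: a column with no checks has not been evaluated, so it should not count as passing.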
Defining Security and Privacy
Against the backdrop of widespread artificial intelligence (AI) adoption, security and privacy have become increasingly prominent concerns. Before delving into how AI impacts our daily lives, we must first clarify the core definitions of “security” and “privacy.”
To apply the Introduction to your own tasks, begin by narrowing the scope—focus on validating just one critical assessment point.
After studying the Introduction, try applying it to a scenario of your own—pay particular attention to whether inputs, processing steps, and outputs align coherently.
1. Defining Security
In information technology, security generally refers to the ability to protect information systems from unauthorized access, attacks, or compromise. It is commonly described in terms of three properties: confidentiality, integrity, and availability. In AI applications, security issues commonly manifest as:
- Data breaches: These occur when sensitive information, such as personally identifiable information (PII) or financial data, is accessed or disclosed by unauthorized entities. For example, some social media platforms have suffered large data breaches caused by improper storage of user data, eroding user trust and triggering legal liability.
- Model attacks: Attackers may exploit knowledge of a machine learning model to attack it directly, for instance with adversarial examples. In such an attack, the attacker adds carefully crafted perturbations to the input so that the model misclassifies it, for example by slightly modifying an image of a cat so that the model labels it as a dog.
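The adversarial-example idea described above can be illustrated with a toy model. This is a minimal sketch under strong assumptions: the "classifier" is a hand-written linear function with made-up weights, the three input features stand in for an image, and the names `WEIGHTS`, `BIAS`, and `perturb` are hypothetical. Attacks on real image models follow the same principle (a small shift of the input in the direction that most changes the output) but are considerably more involved.

```python
# Toy linear classifier: score >= 0 means "dog", otherwise "cat".
# The weights and the input are made-up numbers, not a trained model.
WEIGHTS = [0.9, -0.4, 0.3]
BIAS = -0.1


def score(x):
    """Linear score: dot product of weights and features, plus bias."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def classify(x):
    return "dog" if score(x) >= 0 else "cat"


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def perturb(x, epsilon):
    """Shift each feature by at most epsilon in the direction that raises the score.

    For a linear model, the gradient of the score with respect to the input
    is simply WEIGHTS, so the sign of each weight gives that direction.
    """
    return [xi + epsilon * sign(w) for w, xi in zip(WEIGHTS, x)]


cat_input = [0.2, 0.5, 0.1]
adversarial = perturb(cat_input, epsilon=0.1)

print(classify(cat_input))    # label before the perturbation
print(classify(adversarial))  # label after changing each feature by at most 0.1
```

No feature moves by more than `epsilon`, yet the predicted label flips, which is what makes such inputs hard to spot by inspection.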
2. Defining Privacy
Privacy generally refers to an individual’s right to control their personal information—that is, personal data should not be accessed, used, or disseminated by unauthorized third parties. In AI contexts, privacy concerns primarily arise in the following areas:
- Personal data collection: Many AI systems require large volumes of training data, which often contain personal information. For example, voice assistants (e.g., Siri, Alexa) analyze users’ voice commands and usage patterns to deliver personalized services, which entails collecting large amounts of private data and storing it in the cloud.
- Data processing and transparency: After collection, how data is processed and used frequently lacks transparency. Users rarely understand how their data is analyzed or shared, which fuels distrust in intelligent systems. For instance, a major U.S. healthcare institution faced user complaints after failing to clearly disclose how collected patient data would be used.
When reading foundational security material, first sort risk entry points into four domains: data, models, interfaces, and users. For each domain, identify observable signals and the corresponding mitigation actions, so that security concerns move beyond slogans into actionable practice.
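One way to act on the data collection concern listed above is data minimization combined with a retention limit. The sketch below assumes a hypothetical voice-assistant event: the field names, the `ALLOWED_FIELDS` allowlist, and the 30-day `RETENTION` period are illustrative choices, not requirements taken from any regulation or product.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical allowlist: the only fields this feature needs.
ALLOWED_FIELDS = {"user_id", "command_text", "timestamp"}

# Hypothetical retention period for stored events.
RETENTION = timedelta(days=30)


def minimize(record: dict) -> dict:
    """Return a copy of record that contains only allowlisted fields."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


def is_expired(stored_at: datetime, now: datetime) -> bool:
    """True if the record has been kept longer than the retention period."""
    return now - stored_at > RETENTION


raw_event = {
    "user_id": "u-123",
    "command_text": "set a timer for ten minutes",
    "timestamp": "2024-01-01T08:00:00Z",
    "home_address": "221B Example Street",  # not needed, so never stored
    "contacts": ["alice", "bob"],            # not needed, so never stored
}

stored_event = minimize(raw_event)
print(sorted(stored_event))

stored_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_expired(stored_at, now=datetime(2024, 3, 1, tzinfo=timezone.utc)))
```

Using an allowlist rather than a blocklist means that any new field added upstream is dropped by default until someone decides it is needed.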
3. Bridging Security and Privacy
In practice, security and privacy are not independent concepts. Privacy depends on security: personal data cannot be kept private unless the systems holding it are protected. For example, even if a system encrypts users’ personal information, weaknesses elsewhere, such as unpatched vulnerabilities or exploitable design flaws, may still allow attackers to reach that data. The reverse does not hold, because a well-secured system can still over-collect or misuse data. Integrating security measures with privacy protection strategies is therefore essential.
4. Case Study
A classic example is the 2017 Equifax data breach, which exposed the personal information, including Social Security numbers, birth dates, and addresses, of approximately 143 million U.S. consumers (a figure later revised upward to about 147 million). The attackers exploited a known vulnerability in the Apache Struts web framework that had not been patched. The incident left affected individuals at increased risk of identity theft and sparked widespread public criticism of the company’s data safeguards.
While reading the Introduction, treat its illustrations as navigational aids: first grasp the overall flow, then examine why each step is taken, and finally verify boundary conditions.
# Code example illustrating basic data protection
import hashlib


def protect_data(data: str) -> str:
    """Return the SHA-256 digest of data as a hex string (one-way, not reversible)."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


user_data = "sensitive_information"
hashed_data = protect_data(user_data)
print(f"Protected data: {hashed_data}")
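In the above example, hashing sensitive information with SHA-256 is a basic protective step: the digest cannot be reversed to recover the original value. It is not sufficient on its own, however. An unsalted hash of a predictable value can be guessed by hashing candidate inputs and comparing the results, and hashing does nothing to control who can reach the stored data in the first place.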
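One common way to harden the hashing example is to use a keyed hash (HMAC), so that a digest cannot be reproduced without a secret key. The sketch below uses only the Python standard library. The function names `pseudonymize` and `matches` are hypothetical, and generating the key inside the script is an assumption made purely so the example runs; in a real system the key would come from a secrets manager.

```python
import hashlib
import hmac
import secrets

# Assumption: a real system loads this key from a secrets manager.
# It is generated here only so the example is self-contained.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (HMAC) of value as a hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(value: str, digest: str, key: bytes = SECRET_KEY) -> bool:
    """Check value against a stored digest using a constant-time comparison."""
    return hmac.compare_digest(pseudonymize(value, key), digest)


token = pseudonymize("sensitive_information")
print(matches("sensitive_information", token))
print(matches("other_information", token))

# Without the key, a plain SHA-256 of the same input gives a different digest.
plain = hashlib.sha256(b"sensitive_information").hexdigest()
print(token == plain)
```

This addresses only the guessing problem. For passwords specifically, a deliberately slow function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` is the more usual choice, and access control on the stored data is still needed either way.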
Summary
In the age of AI, understanding the precise definitions of security and privacy is vital for developing responsible AI systems. As technology continues to evolve, organizations and developers must continually update and strengthen their security and privacy protections to keep pace with emerging threats and challenges.
Next, we will outline the goals and structure of this tutorial—providing a comprehensive knowledge framework to deepen your understanding of AI security and privacy.