🧠 Faces Can Lie – Deep Face Identity Analysis

Overview

This project explores how deep learning can be used to detect, compare, and analyze human faces from a single collage image.
The main goal was to check whether all faces in the image belong to the same person or to multiple people using modern face-recognition techniques.

The project demonstrates how feature embeddings, similarity scores, and clustering can reveal patterns of identity even when appearances vary due to lighting, disguise, or pose.

🔍 Motivation

Human eyes can be easily tricked by changes in expression, lighting, and disguises — but computers analyze faces mathematically.
This project aims to show how a machine learning model sees faces numerically using cosine similarity, and how this can expose the truth behind the phrase “faces can lie.”

⚙️ Methodology

1. Face Detection

The image (faces_can_lie.jpg) was processed with InsightFace, which provides a pre-trained face detector and an ArcFace face-recognition network.
The detector automatically locates all faces in the collage.
A total of 15 faces were detected and cropped.
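
The detection and cropping step can be sketched roughly as follows. This is a minimal sketch and not the contents of code.py: the model pack name `buffalo_l`, the `det_size` value, and the helper names `crop_face` and `detect_faces` are assumptions made for illustration.

```python
import numpy as np


def crop_face(image: np.ndarray, bbox) -> np.ndarray:
    """Crop an H x W x C image to (x1, y1, x2, y2), clamped to the image bounds."""
    h, w = image.shape[:2]
    x1, y1, x2, y2 = (int(round(float(v))) for v in bbox)
    x1, y1 = max(x1, 0), max(y1, 0)
    x2, y2 = min(x2, w), min(y2, h)
    return image[y1:y2, x1:x2]


def detect_faces(image_path: str):
    """Detect faces with InsightFace and return (faces, crops). Assumed setup."""
    # Imported lazily so crop_face can be used without these packages installed.
    import cv2
    from insightface.app import FaceAnalysis

    # "buffalo_l" is an assumed model pack; the project may use a different one.
    app = FaceAnalysis(name="buffalo_l", providers=["CPUExecutionProvider"])
    app.prepare(ctx_id=-1, det_size=(640, 640))  # negative ctx_id selects the CPU

    image = cv2.imread(image_path)  # BGR array, or None if the file cannot be read
    if image is None:
        raise FileNotFoundError(image_path)

    faces = app.get(image)  # each detected face carries .bbox and .embedding
    crops = [crop_face(image, face.bbox) for face in faces]
    return faces, crops
```

Calling `detect_faces("faces_can_lie.jpg")` would then return one entry per detected face. Clamping the box matters because detectors can report coordinates slightly outside the image.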

2. Feature Extraction (Embeddings)

For every detected face, the ArcFace model generates a 512-dimensional embedding vector that represents the person’s identity.
These embeddings encode:

Facial geometry (eyes, nose, mouth structure)
Texture and skin patterns
Relative symmetry
Local contrasts and fine details

Each embedding is normalized to unit length, so only its direction carries identity information and the vector's magnitude plays no role in the comparison.
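
As an illustration of the normalization step, here is a minimal NumPy sketch. The helper name `l2_normalize` is hypothetical, and the random array stands in for real ArcFace outputs.

```python
import numpy as np


def l2_normalize(embeddings: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row of an (n, d) array to unit Euclidean length."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)  # eps guards against zero vectors


# Stand-in data: 15 random 512-dimensional vectors, not real face embeddings.
rng = np.random.default_rng(0)
raw = rng.normal(size=(15, 512))
unit = l2_normalize(raw)
```

Multiplying a raw vector by any positive constant leaves its normalized form unchanged, which is why only the direction matters afterwards. InsightFace face objects also expose a ready-made `normed_embedding` attribute that can be used instead.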

3. Similarity Computation

To compare faces, cosine similarity is calculated between every pair of embeddings:

$\mathrm{similarity}(A,B)=\frac{A\cdot B}{\|A\|\,\|B\|}$

1.0 → Same direction (identical embedding / same person)
0.0 → Orthogonal (unrelated faces)
Negative → Opposite directions (very dissimilar)

A 15×15 similarity matrix was built and visualized as a heatmap, along with a histogram of all pairwise similarities.
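
The matrix construction can be sketched in a few lines of NumPy. The function names below are hypothetical, and the three toy vectors are chosen only to reproduce the 1.0, 0.0, and negative cases listed above.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the (n, n) matrix of cosine similarities between rows."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)
    # After normalization the dot product equals the cosine similarity.
    return np.clip(unit @ unit.T, -1.0, 1.0)


def pairwise_values(sim: np.ndarray) -> np.ndarray:
    """Upper-triangle entries (i < j): one value per unordered pair of faces."""
    i, j = np.triu_indices(sim.shape[0], k=1)
    return sim[i, j]


# Toy vectors: B is orthogonal to A, C points opposite to A.
toy = np.array([[2.0, 0.0], [0.0, 5.0], [-1.0, 0.0]])
toy_sim = cosine_similarity_matrix(toy)
```

For 15 faces, `pairwise_values` yields 15 × 14 / 2 = 105 numbers, which are the values a histogram of pairwise similarities would be built from.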

4. Clustering and Decision

Using Agglomerative Clustering on the cosine distance (1 − similarity), faces were automatically grouped into identity clusters.
A similarity threshold of 0.45 was used to decide whether two faces represent the same person.
In distance terms this corresponds to a cutoff of 1 − 0.45 = 0.55: clusters closer than this cutoff were merged.

5. Output Summary

All results — detected faces, plots, and similarity data — are saved in the /outputs folder for review.

📊 Results and Observations

• Detected Faces

15 faces were found and cropped successfully.

• Similarity Matrix

The heatmap shows strong color only along the diagonal (self-comparisons), while most other cells are darker — meaning most faces are dissimilar.

• Similarity Histogram

The histogram peaks around 0.1 – 0.3, showing that most pairs have very low similarity.
Only one or two pairs exceed 0.45, indicating almost all faces belong to different individuals.

• Quantitative Summary

| Metric | Value |
| --- | --- |
| Faces detected | 15 |
| Estimated unique identities | 14 |
| Most similar pair | Faces 6 & 7 (≈ 0.46) |
| Average similarity | ≈ 0.23 |

Interpretation:
Almost every face in the collage is unique.
Only one pair (faces 6 and 7, ≈ 0.46) scores just above the 0.45 threshold, so it may be the same person captured twice, although the margin is small.
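
The summary figures above can be derived from the similarity matrix along these lines. This is a sketch under assumptions: the function name is hypothetical, and face numbers are reported 1-based, which may not match the numbering used in the project.

```python
import numpy as np


def summarize_similarities(sim: np.ndarray) -> dict:
    """Most similar pair and mean similarity over all distinct pairs (i < j)."""
    i, j = np.triu_indices(sim.shape[0], k=1)  # skip the diagonal and duplicates
    values = sim[i, j]
    best = int(np.argmax(values))
    return {
        # 1-based face numbers (assumed convention)
        "most_similar_pair": (int(i[best]) + 1, int(j[best]) + 1),
        "max_similarity": float(values[best]),
        "mean_similarity": float(values.mean()),
    }


# Toy 3 x 3 matrix, not the project's data.
toy_sim = np.array([
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.6],
    [0.1, 0.6, 1.0],
])
toy_summary = summarize_similarities(toy_sim)
```

Excluding the diagonal is the important design choice here: including the self-comparisons (all 1.0) would inflate the average.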

📈 Score Trend Analysis

The similarity distribution is heavily skewed toward low values (0.1 – 0.3).
That means the model sees these faces as different identities.
If all were the same person in disguises, most scores would have been higher (around 0.6–0.8).
The clustering confirms this trend by grouping faces into 14 distinct identities.
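
To check where the distribution peaks, the pairwise scores can be binned with NumPy. The sketch below uses a hypothetical helper and a handful of made-up scores; the 0.1 bin width is an assumption.

```python
import numpy as np


def similarity_histogram(values: np.ndarray, bin_width: float = 0.1):
    """Count similarities in fixed-width bins covering [-1, 1]."""
    n_bins = int(round(2.0 / bin_width))
    edges = np.linspace(-1.0, 1.0, n_bins + 1)
    counts, edges = np.histogram(values, bins=edges)
    return counts, edges


# Made-up scores for illustration only.
toy_values = np.array([0.15, 0.25, 0.25, 0.46])
toy_counts, toy_edges = similarity_histogram(toy_values)
peak = int(np.argmax(toy_counts))  # index of the fullest bin
```

The fullest bin is bounded by `toy_edges[peak]` and `toy_edges[peak + 1]`; on the real data this is the range the histogram peak would be read from.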

🧮 Metric Used

Cosine Similarity: main metric for comparing faces.
Cosine Distance: used for clustering (distance = 1 − similarity).
Threshold: 0.45 chosen empirically for same/different classification.

⚠️ Factors That Affect the Results

| Factor | Example | Effect |
| --- | --- | --- |
| Lighting | Shadows or uneven exposure | Alters contrast → lowers similarity |
| Pose | Side or tilted faces | Reduces overlap of key landmarks |
| Occlusion | Glasses, beard, hats, scarves | Hides facial regions |
| Image Quality | Blurry or low-resolution crops | Weakens embeddings |
| Alignment | Poorly centered face | Mis-represents geometry |
| Model Bias | Dataset limitations | Affects accuracy for certain face types |
| Threshold Value | Slight changes | Can merge or split clusters |
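
The sensitivity to the threshold value can be probed by counting how many pairs would be classed as the same person at several cutoffs. The helper name and the toy scores below are illustrative assumptions, not project data.

```python
import numpy as np


def pairs_above_threshold(sim: np.ndarray, thresholds) -> dict:
    """For each threshold, count distinct pairs (i < j) with similarity above it."""
    i, j = np.triu_indices(sim.shape[0], k=1)
    values = sim[i, j]
    return {float(t): int(np.sum(values > t)) for t in thresholds}


# Toy matrix: one pair sits just above 0.45, as in the reported results.
toy_sim = np.array([
    [1.00, 0.46, 0.20],
    [0.46, 1.00, 0.10],
    [0.20, 0.10, 1.00],
])
toy_sweep = pairs_above_threshold(toy_sim, [0.40, 0.45, 0.50])
```

In this toy case the single matching pair disappears once the threshold moves from 0.45 to 0.50, which is the kind of merge-or-split effect the table refers to.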

🧰 Tech Stack

Language: Python 3.10
Libraries: InsightFace, OpenCV, NumPy, Matplotlib, Scikit-Learn
Environment: macOS, CPU inference via ONNX Runtime (CPUExecutionProvider)

🧠 What This Project Demonstrates

How to automatically detect and crop faces from an image.
How to use embeddings to represent facial identity numerically.
How to apply cosine similarity and clustering to estimate identities.
How disguises, lighting, and angle can fool visual recognition but still be measurable through data.

🧩 Sample Output

After running, the /outputs folder will contain:
Cropped face images
Heatmap of pairwise similarities
Histogram of similarity distribution
JSON file summarizing cluster information
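
A cluster summary of this kind could be written as sketched below. The file name `clusters.json`, the JSON keys, and the function name are all hypothetical; the actual layout of the project's JSON file is not described here.

```python
import json
from pathlib import Path


def write_cluster_summary(labels, out_dir, filename="clusters.json"):
    """Write a {cluster id: [face indices]} summary to out_dir/filename."""
    clusters = {}
    for face_index, label in enumerate(labels):
        # JSON object keys must be strings, so cluster ids are stringified.
        clusters.setdefault(str(int(label)), []).append(face_index)

    summary = {
        "num_faces": len(labels),
        "num_identities": len(clusters),
        "clusters": clusters,
    }
    out_path = Path(out_dir) / filename
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(summary, indent=2))
    return out_path
```

For example, `write_cluster_summary([0, 1, 1], "outputs")` would record three faces grouped into two identities.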

📘 Conclusion

This project shows how machine learning models like ArcFace can analyze visual identity beyond what humans easily notice. By converting faces into embeddings and comparing them with cosine similarity, we can estimate identity relationships between multiple faces, even when appearances change. For this collage, the scores suggest that “faces can lie”: appearances may look similar, but the numbers still tell them apart.
