1. DeepFaceLab Software Installation
The official build from the original author ends with an .exe extension, but it is essentially a 7z archive. Choose an appropriate path and extract it.
- Antivirus False Positives
  - A few users reported antivirus alerts and file deletions causing incomplete project files and errors.
  - DeepFaceLab is a large open-source project on GitHub, verified safe by countless users.
  - While unlikely, third-party tampering ("poisoning") is possible. This guide uses the official original version.
- Choosing an Installation Path
- Avoid non-English characters in the path.
- Avoid overly deep directory hierarchies to simplify navigation and operations.
- Key Notes
- For AMD GPUs: Use DirectX12 version, not the RTX3000 edition.
- Update your graphics drivers.
2. Convert Source Video to Images
- Double-click the batch file `extract images from video data_src.bat`. A command window will open:
  - FPS: Enter `10` (frames per second) and press Enter. For example, a 30 FPS video will then be sampled at 10 frames per second. Reducing frames saves processing time and resources.
  - Format: Enter `jpg` (recommended). JPG balances quality and file size. Use PNG only for lossless requirements.
- After processing, images will appear in `workspace/data_src`.
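Under the hood, DeepFaceLab's extraction scripts drive ffmpeg. For reference, here is a minimal Python sketch of a roughly equivalent call, assuming ffmpeg is on your PATH and the source clip is saved as `workspace/data_src.mp4`:

```python
# Rough equivalent of "extract images from video data_src.bat".
# Paths assume the default DeepFaceLab workspace layout.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "workspace/data_src.mp4",   # source video
    "-vf", "fps=10",                  # sample 10 frames per second
    "-q:v", "2",                      # high JPEG quality (1 = best, 31 = worst)
    "workspace/data_src/%05d.jpg",    # numbered output frames
], check=True)
```

For the target video in the next step, the same call works with the `-vf fps=10` filter removed, so that every frame is kept.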
3. Convert Target Video to Images
- Double-click the batch file `extract images from video data_dst FULL FPS.bat`. No FPS adjustment here: every frame is extracted. Choose `jpg` format.
- Output images will populate `workspace/data_dst`.
4. Extract Source Faces
- Run the Batch File: Double-click the script `data_src faceset extract.bat` to start the process.
- Face Type (Options: h/mf/f/wf/head | `?:help` for details):
  - What is Face Type? It determines the size of the facial area extracted from the original image.
  - Definitions:
    - `h` (half-face): Covers half the face; used in early deepfake models (3+ years ago) due to limited VRAM.
    - `mf` (mid-half-face): Slightly larger than `h`.
    - `f` (full-face): Captures the full face.
    - `wf` (whole-face): Includes the entire face and some surrounding area. At 256px resolution the effective facial area is about half, but modern GPUs can handle higher resolutions.
    - `head`: Extracts the entire head, which may waste VRAM unnecessarily.
  - Recommendation: **Use `wf`** for the best balance: it captures the full face while remaining compatible with `f`-type datasets. Avoid `h`/`mf` (too restrictive) and `head` (overkill).
- Max Faces per Image (Default: 0 | `?:help` for details):
  - If your images contain multiple faces, set this to limit extraction (e.g., `3` speeds up processing). `0` = no limit (extract all detected faces).
- Image Size (Range: 256–2048 | Default: 512 | `?:help` for details):
  - Use 512px unless your source images are exceptionally high quality and require no enhancement.
  - Higher resolutions improve facial detail but demand more resources.
- Image Quality (JPEG Quality: 1–100):
- Higher values (e.g., 90–100) produce larger files but retain more detail.
- Post-Processing:
  - After extraction, check the output folder `workspace\data_src\aligned`.
  - Cleanup Tips (for complex datasets):
    - Delete blurry or low-resolution faces (a scripted blur check is sketched below).
    - Remove non-target individuals.
    - Discard incomplete or partially obscured faces.
    - Eliminate images with extreme lighting variations or heavy obstructions (e.g., hair, hands).
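Blur screening can be semi-automated. This is a minimal sketch, not part of DeepFaceLab itself: it scores each aligned face with the variance of the Laplacian (a common sharpness heuristic) and lists likely-blurry candidates. The threshold of 100 is an assumption to tune per dataset.

```python
# Flag likely-blurry faces in the aligned folder using Laplacian variance.
# Requires opencv-python. Review the printed list before deleting anything.
import os
import cv2

ALIGNED_DIR = r"workspace\data_src\aligned"
THRESHOLD = 100.0  # lower variance = blurrier image; tune per dataset

for name in sorted(os.listdir(ALIGNED_DIR)):
    if not name.lower().endswith((".jpg", ".png")):
        continue
    path = os.path.join(ALIGNED_DIR, name)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        continue  # unreadable file; skip
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    if sharpness < THRESHOLD:
        print(f"Blurry candidate: {name} (score {sharpness:.1f})")
        # os.remove(path)  # uncomment to delete after reviewing the list
```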
5. Extract Target Faces
- Run the Batch File: Double-click the script `data_dst faceset extract.bat` to begin extraction.
- Process Details: The steps here mirror those for extracting source faces (see the previous section).
- Output & Filtering:
  - Extracted faces are saved in the `data_dst/aligned` folder. After extraction, curate the dataset:
    - Delete files ending with `_1`: These often represent duplicate or low-confidence detections (a small cleanup script is sketched after this list).
    - Core Rule: Keep only faces you intend to replace (e.g., the target person). Remove all unrelated faces.
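Deleting the `_1` files by hand gets tedious on long clips. A minimal sketch, assuming the default `workspace/data_dst/aligned` layout; review the printed list before uncommenting the delete:

```python
# List (and optionally delete) aligned faces whose filename stem ends
# with "_1" -- typically secondary or low-confidence detections.
import os

ALIGNED_DIR = os.path.join("workspace", "data_dst", "aligned")

for name in sorted(os.listdir(ALIGNED_DIR)):
    stem, ext = os.path.splitext(name)
    if ext.lower() in (".jpg", ".png") and stem.endswith("_1"):
        print("Would delete:", name)
        # os.remove(os.path.join(ALIGNED_DIR, name))  # uncomment after review
```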
- Debug Folder (`data_dst/aligned_debug`):
  - Open any image in this folder to visualize the face-detection results. Three colored overlays are displayed:
    - Red: The region cropped for the face image.
    - Blue: The algorithm's detected facial area.
    - Green: Facial landmarks (contours and key points such as eyes, nose, and mouth).
  - Use these debug images to verify whether faces are accurately identified and aligned. Misaligned or missed detections may require manual cleanup or adjustments to extraction settings.
This workflow ensures precise targeting of faces for replacement while minimizing noise in the dataset.
6. Training the Model
This is the most time-consuming phase of the process. The latest DeepFaceLab version offers three model types: Quick96, SAEHD, and AMP. For this tutorial, we’ll use Quick96—a lightweight model optimized for lower VRAM requirements and faster training times. The trade-offs are limited customization options, lower resolution, and slightly reduced output quality compared to advanced models.
6.1 Step-by-Step Guide
- Launch Training
  - Double-click the batch file `train Quick96.bat`.
- Initial Setup
- First run: If no existing model is found, you’ll be prompted to name your new model. Press Enter to confirm.
- Existing models: Choose between resuming training on a saved model or creating a new one.
- Select Hardware
  - Input `0` to select your GPU (the default choice) and press Enter.
- Monitor Training Progress
- After initialization, the command window displays real-time metrics:
- `[Timestamp] [Iteration] [Time/Iter] [Src Loss] [Dst Loss]`
  `[16:25:30] [#000002] [0059ms] [4.2341] [3.7194]`
- Key metric: Focus on Dst Loss (Destination Loss). Lower values indicate better alignment, with 0.1x being a practical target.
- Preview Window
  - Auto-launch: The preview window opens automatically.
  - Refresh previews: Hover your mouse over the window and press `P` (some systems may require an additional Enter press).
  - Shortcuts:
    - `Enter`: Stop training and save progress.
    - `Space`: Toggle views (helpful for debugging).
    - `S`: Manually save without stopping.
    - `P`: Refresh previews.
  - Preview layout:
    - Column 1: Source faces.
    - Column 2: Model-generated approximations of source faces.
    - Column 3: Target faces.
    - Column 4: Model-generated approximations of target faces.
    - Column 5: Blending result (expression alignment).
- As the number of iterations increases, the loss values gradually decrease. However, 20,000 iterations are far from sufficient—when training a model from scratch, achieving decent results typically requires over 1 million iterations.
- Iteration benchmarks:
  - 20k iterations: Barely scratches the surface.
  - 100k+ iterations: Minimum for basic usability.
  - 1M+ iterations: Ideal for high-quality outputs.
- When to Stop Training?
  - Loss Value Check
    - Aim for a Dst Loss near 0.1 (values of 0.12–0.15 are typical stopping points); pushing below 0.1 yields diminishing returns.
  - Visual Inspection
    - Column 2 should closely resemble Column 1 (source face replication).
    - Column 4 should align with Column 3 (target face integration).
    - Column 5 must show natural expression transfer and sharp details.
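If you redirect the trainer's console output to a file, the loss trend can be tracked programmatically. A minimal sketch that parses the log format shown above; the file name `train_log.txt` is an assumption:

```python
# Parse trainer log lines like:
#   [16:25:30] [#000002] [0059ms] [4.2341] [3.7194]
# and report the running Dst Loss every 1000 iterations.
import re

LINE = re.compile(
    r"\[(?P<time>[\d:]+)\]\s+\[#(?P<iter>\d+)\]\s+\[(?P<ms>\d+)ms\]"
    r"\s+\[(?P<src>[\d.]+)\]\s+\[(?P<dst>[\d.]+)\]"
)

def dst_losses(path):
    with open(path, encoding="utf-8", errors="ignore") as fh:
        for raw in fh:
            m = LINE.search(raw)
            if m:
                yield int(m.group("iter")), float(m.group("dst"))

if __name__ == "__main__":
    for iteration, dst in dst_losses("train_log.txt"):
        if iteration % 1000 == 0:
            print(f"iter {iteration}: Dst Loss {dst:.4f}")
```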
7. Apply the Model
- Launch the Merge Process
  - Double-click the batch file `merge Quick96.bat` (refer to Image-1).
  - Input parameters as shown in Image-2.
- Shortcut Reference Interface
  - A window will pop up (Image-3) displaying the available keyboard shortcuts with Chinese annotations.
  - Note: This screen has no interactive function; it simply lists the shortcuts. For detailed parameter explanations, refer to future articles.
  - Critical step: Ensure your input method is switched to English, then press Tab to enter the preview/editing interface.
- Post-Processing Workflow
  - This phase resembles Photoshop-style retouching to enhance facial blending realism. Key tools include:
    - Feathering (soften edges)
    - Brightness/Contrast adjustments
    - Sharpening/Denoising (refine details)
  - Key controls:
    - W/S: Adjust face opacity/blending intensity.
    - E/D: Tweak sharpening strength.
- Preview Comparison
  - Left panel: Raw output (faces appear "pasted", with visible seams).
  - Right panel: Processed result (natural integration achieved through adjustments).
- Apply Settings to All Frames
- Press Shift+? (applies current adjustments to subsequent frames).
- Manual frame navigation: Use < and > keys to scrub through frames.
- Start Automated Merging
- Press Shift+> to begin full-video processing.
- Completion: Close the CMD window manually once the progress bar hits 100%.
- Output Results
- Two new folders will appear:
- merged: Final blended frames.
- merged_mask: Grayscale masks for advanced editing (e.g., selective adjustments in compositing software).
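The `merged_mask` frames are what make selective re-editing possible. As a minimal sketch (the frame name and the `workspace/data_dst/...` paths are assumptions from a default setup), this re-blends one merged frame onto its original using the mask as alpha, the same operation a compositor would perform:

```python
# Composite a merged frame back onto the original frame using its
# grayscale mask as a per-pixel alpha. Requires opencv-python.
import cv2

frame  = cv2.imread("workspace/data_dst/00001.jpg").astype(float)
merged = cv2.imread("workspace/data_dst/merged/00001.jpg").astype(float)
mask   = cv2.imread("workspace/data_dst/merged_mask/00001.jpg",
                    cv2.IMREAD_GRAYSCALE).astype(float) / 255.0
mask   = cv2.merge([mask, mask, mask])  # one channel per color plane

# Inside the mask take the merged face; outside, keep the original frame.
composite = merged * mask + frame * (1.0 - mask)
cv2.imwrite("composite_00001.jpg", composite.astype("uint8"))
```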
8. Final Video Synthesis
- Generate MP4 Video
  - Double-click the batch file `merged to mp4.bat`.
  - Bitrate setting: Input `3` (the recommended default). The script automatically inherits the source video's metadata, including frame rate and audio track.
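For reference, here is a minimal Python sketch of a roughly comparable encode. The 30 FPS rate, the paths, and reading the `3` as roughly 3 Mbit/s are assumptions; the batch script handles the metadata handover for you:

```python
# Rough equivalent of "merged to mp4.bat": encode the merged frames and
# copy the audio track from the original target video, if one exists.
import subprocess

subprocess.run([
    "ffmpeg",
    "-framerate", "30",                           # assumed source frame rate
    "-i", "workspace/data_dst/merged/%05d.jpg",   # blended frames
    "-i", "workspace/data_dst.mp4",               # original target video
    "-map", "0:v", "-map", "1:a?",                # video from frames, audio if present
    "-c:v", "libx264", "-b:v", "3M",              # ~3 Mbit/s target bitrate
    "-pix_fmt", "yuv420p",
    "workspace/result.mp4",
], check=True)
```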
- Output Files
  - After processing, two files appear in the `workspace` folder:
    - result.mp4: The final deepfake video.
    - result_mask.mp4: A mask video for post-production adjustments (e.g., fine-tuning the blend in editing software).
- Review Results
- Play result.mp4 to verify facial alignment, lighting consistency, and overall realism.
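A quick way to judge realism is to watch the result next to the original. A minimal sketch using ffmpeg's hstack filter, assuming both clips share the same resolution and frame rate:

```python
# Build a side-by-side comparison of the original target video and the
# deepfake result for visual inspection.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "workspace/data_dst.mp4",   # original target video
    "-i", "workspace/result.mp4",     # deepfake result
    "-filter_complex", "hstack=inputs=2",
    "-c:v", "libx264", "-pix_fmt", "yuv420p",
    "workspace/compare.mp4",
], check=True)
```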
9. Conclusion
- Complex but Manageable
- While the workflow appears daunting due to its many steps, this guide breaks down every critical detail. Follow the instructions meticulously, and you’ll successfully complete your first deepfake project.
- Not a One-Click Solution
- DeepFaceLab is not a “magic button” tool—it requires patience, experimentation, and iterative refinement to achieve professional results.
- Mastery Demands Investment
- High-quality outputs hinge on understanding nuances: curating datasets, tuning model parameters, and mastering post-processing.
- Stay Tuned
  - Future tutorials will cover advanced features:
    - SAEHD/AMP models for higher resolution.
    - GAN training to enhance texture realism.
    - Frame interpolation for smoother motion.