Initial Commit

2025-12-11 23:26:21 +01:00
commit a23f189267
12 changed files with 1255 additions and 0 deletions

.env Normal file

@@ -0,0 +1,5 @@
FLASK_ENV=development
UPLOAD_FOLDER=uploads
SEGMENTED_FOLDER=segmented
ALLOWED_EXTENSIONS=.mp4,.avi,.mov,.mkv
SAM_MODEL_SIZE=vit_b

.gitignore vendored Normal file

@@ -0,0 +1,13 @@
# Python-generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info
segmented/
uploads/
# Virtual environments
.venv

.python-version Normal file

@@ -0,0 +1 @@
3.12

README.md Normal file

@@ -0,0 +1,210 @@
# Video Object Segmentation with SAM2
A web application that allows you to upload videos, click on objects, and segment them out using Meta's SAM2 (Segment Anything Model 2) AI model.
## Features
- 📤 Upload video files (MP4, AVI, MOV, MKV)
- 🖼️ Preview first frame of the video
- 🎯 Click on objects to select them for segmentation
- ✂️ AI-powered object segmentation using SAM2
- 🎥 Download segmented video results
- 🎨 Beautiful, responsive user interface
## Requirements
- **Python 3.8-3.12** (tested and compatible)
- PyTorch 2.2.0+ with CUDA (recommended for GPU acceleration)
- Flask
- OpenCV
- NumPy
- Segment Anything Model 2
**Python Version Compatibility:**
- ✅ Python 3.8, 3.9, 3.10, 3.11, 3.12 all supported
- 🔄 Automatic torch version selection based on Python version
- 💡 Python 3.12 users: Use the updated requirements (torch 2.2.0+)
## Why use uv?
We recommend using **uv** for this project because:
- **Faster dependency resolution**: uv is significantly faster than pip
- **Better virtual environment management**: Cleaner and more reliable venvs
- **Deterministic builds**: More consistent dependency resolution
- **Modern Python tooling**: Built with Rust for performance
- **Better compatibility**: Handles complex dependency trees better
If you're working on Python projects, uv is a great modern alternative to pip + virtualenv!
## Installation
### 1. Clone the repository
```bash
git clone https://github.com/yourusername/video-segmentation-sam2.git
cd video-segmentation-sam2
```
### 2. Install dependencies (using uv - recommended)
First, install uv (a fast Python package installer and virtual environment manager):
```bash
# Install uv
pip install uv
# Create virtual environment and install dependencies
uv venv .venv
source .venv/bin/activate # On Windows: .\.venv\Scripts\activate
uv pip install -r requirements.txt
```
### 3. Install SAM2 manually
SAM2 needs to be installed manually from GitHub:
```bash
# Clone the segment-anything repository
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
# Install SAM2 in development mode
pip install -e .
# Download the model checkpoint (ViT-B recommended)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
mv sam_vit_b_01ec64.pth ..
cd ..
# Clean up
rm -rf segment-anything
```
These steps will:
- Install SAM2 from source in development mode
- Download the ViT-B model checkpoint into the project root
- Remove the cloned repository afterwards
### 4. Alternative: Standard pip installation
If you prefer not to use uv:
```bash
pip install -r requirements.txt
```
Then follow the manual SAM2 install and model download steps above.
### 5. Download SAM2 model weights
The application uses **ViT-B** (a smaller, faster model) by default. You need the file `sam_vit_b_01ec64.pth` in the root directory.
Download from: [https://github.com/facebookresearch/segment-anything](https://github.com/facebookresearch/segment-anything)
Or use the following command:
```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```
**Model Options:**
- `vit_b` (default): Fastest, good for testing - `sam_vit_b_01ec64.pth`
- `vit_l`: Medium size/performance - `sam_vit_l_0b3195.pth`
- `vit_h`: Best accuracy, largest - `sam_vit_h_4b8939.pth`
You can change the model by setting `SAM_MODEL_SIZE` in `.env`.
### 6. Run the application
```bash
python app.py
```
The application will start on `http://localhost:5000`
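Once it is running, the `/test` route provides a quick health check; for example, with the standard library:
```python
import json
from urllib.request import urlopen

# Expects: {"status": "ok", "message": "Flask app is running"}
print(json.load(urlopen("http://localhost:5000/test")))
```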
## Usage
1. **Upload a video**: Click the "Select Video File" button and choose a video file
2. **Select object**: Click on the object you want to segment in the preview image
3. **Add more points**: Click additional points to help the AI better understand the object
4. **Segment**: Click "Segment Object" to start the segmentation process
5. **Download**: Once processing is complete, preview and download your segmented video
## Configuration
You can configure the application by editing the `.env` file:
```env
FLASK_ENV=development
UPLOAD_FOLDER=uploads
SEGMENTED_FOLDER=segmented
ALLOWED_EXTENSIONS=.mp4,.avi,.mov,.mkv
SAM_MODEL_SIZE=vit_b
```
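For reference, `app.py` loads these values at startup with `python-dotenv`, falling back to defaults when a key is missing; roughly:
```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment

# Each setting falls back to a default when the key is absent
upload_folder = os.getenv('UPLOAD_FOLDER', 'uploads')
allowed = set(os.getenv('ALLOWED_EXTENSIONS', '.mp4,.avi,.mov,.mkv').split(','))
```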
## Technical Details
### Backend
- **Flask**: Web framework
- **SAM2**: Segment Anything Model 2 for object segmentation
- **OpenCV**: Video processing and frame manipulation
- **PyTorch**: Deep learning framework for running SAM2
### Frontend
- **HTML5/CSS3**: Responsive user interface
- **JavaScript**: Interactive point selection and AJAX requests
- **Base64 encoding**: For preview image transfer
### Processing Pipeline
1. Video upload and first frame extraction
2. User selects points on the object to segment
3. SAM2 processes each frame with the selected points
4. Masks are applied to each frame
5. Processed frames are combined into a new video
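A condensed sketch of this loop, mirroring `process_video_segmentation` and `apply_mask_to_frame` in `app.py` (the same click points are reused on every frame):
```python
import cv2
import numpy as np

def segment_frames(predictor, cap, out, points):
    """Run SAM on each frame using the user's click points (all foreground)."""
    input_points = np.array(points)
    input_labels = np.ones(len(points), dtype=np.int64)  # 1 = foreground point
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        predictor.set_image(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # SAM expects RGB
        masks, _, _ = predictor.predict(
            point_coords=input_points,
            point_labels=input_labels,
            multimask_output=False,
        )
        mask = masks[0].astype(np.uint8) * 255
        overlay = np.zeros_like(frame)
        overlay[:, :, 2] = mask  # red channel (frames are BGR)
        out.write(cv2.addWeighted(frame, 0.5, overlay, 0.5, 0))
```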
## Performance Considerations
- **GPU recommended**: SAM2 runs much faster with CUDA-enabled GPU
- **Video length**: Longer videos will take more time to process
- **Resolution**: Higher resolution videos require more processing power
- **Points selection**: More points can help with complex objects but may slow down processing
## Troubleshooting
### Common Issues
**Issue: SAM2 model not found**
- Solution: Download the model checkpoint and place it in the root directory
**Issue: CUDA out of memory**
- Solution: Reduce video resolution or use smaller batch sizes
**Issue: Slow processing on CPU**
- Solution: Use a machine with GPU or reduce video resolution
**Issue: Video format not supported**
- Solution: Convert your video to MP4 format
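For the memory and CPU-speed issues above, downscaling a video before uploading helps considerably. A minimal OpenCV sketch (the 0.5 scale factor is just an example):
```python
import cv2

def downscale_video(src_path, dst_path, scale=0.5):
    """Write a resized copy of a video; smaller frames need less GPU memory and less SAM compute."""
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH) * scale)
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT) * scale)
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        out.write(cv2.resize(frame, (w, h)))
    cap.release()
    out.release()
```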
## License
This project is licensed under the MIT License. The SAM2 model is provided by Meta Research under its own license.
## Acknowledgements
- Meta Research for the Segment Anything Model
- Flask team for the web framework
- OpenCV team for computer vision tools
## Future Improvements
- [ ] Add support for multiple object segmentation
- [ ] Implement background removal options
- [ ] Add video trimming functionality
- [ ] Support for real-time preview
- [ ] Batch processing of multiple videos
- [ ] Advanced segmentation parameters (threshold, etc.)
- [ ] Cloud deployment options

SETUP_GUIDE.md Normal file

@@ -0,0 +1,140 @@
# Video Object Segmentation with SAM2 - Setup Guide
## Quick Start (macOS/Linux)
### 1. Install uv
```bash
pip install uv
```
### 2. Create virtual environment and install dependencies
```bash
uv venv .venv
source .venv/bin/activate
uv pip install -r requirements.txt
```
### 3. Install SAM manually
```bash
# Clone the segment-anything repository
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything
# Install SAM in development mode
pip install -e .
# Download the ViT-B model checkpoint (recommended)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
mv sam_vit_b_01ec64.pth ..
cd ..
# Clean up
rm -rf segment-anything
```
### 4. Create directories
```bash
mkdir -p uploads segmented
```
### 5. Run the application
```bash
python app.py
```
## Project Structure
```
video-segmentation-sam2/
├── app.py # Main Flask application
├── requirements.txt # Python dependencies
├── .env # Configuration
├── templates/
│ └── index.html # Web interface
├── uploads/ # Uploaded videos (created automatically)
├── segmented/ # Processed videos (created automatically)
└── sam_vit_b_01ec64.pth # SAM2 model checkpoint
```
## Configuration
Edit `.env` file to customize:
```env
FLASK_ENV=development
UPLOAD_FOLDER=uploads
SEGMENTED_FOLDER=segmented
ALLOWED_EXTENSIONS=.mp4,.avi,.mov,.mkv
SAM_MODEL_SIZE=vit_b # Options: vit_b, vit_l, vit_h
```
## Troubleshooting
### NumPy compatibility issues
If you see errors about NumPy 2.x compatibility:
```
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.3.5
```
**Solution:** The requirements.txt already specifies `numpy<2.0` to avoid this issue. Make sure you:
1. Delete your virtual environment: `rm -rf .venv`
2. Recreate it: `uv venv .venv`
3. Reinstall dependencies: `uv pip install -r requirements.txt`
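You can confirm the fix from inside the virtual environment:
```python
import numpy
print(numpy.__version__)  # should print a 1.x version per the numpy<2.0 pin
```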
### SAM2 not found
If you get `ImportError: SAM2 is not installed`, make sure you:
1. Cloned the segment-anything repository
2. Ran `pip install -e .` from the segment-anything directory
3. Have the checkpoint file in the root directory
### CUDA not available
If you don't have a GPU, the app will use CPU (slower). For better performance:
1. Install CUDA toolkit
2. Install cuDNN
3. Make sure `torch.cuda.is_available()` returns True
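A quick way to verify which device the app will use (it picks CUDA when available, otherwise CPU):
```python
import torch

print(torch.cuda.is_available())  # True means SAM will run on the GPU
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "running on CPU")
```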
### Port already in use
If port 5000 is busy:
1. Change the port in `app.py` (last line)
2. Or kill the process using port 5000:
```bash
lsof -i :5000
kill -9 <PID>
```
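For the first option, the final line of `app.py` is the one to edit (5001 here is just an example):
```python
if __name__ == '__main__':
    app.run(debug=True, port=5001)
```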
## Model Options
| Model | Checkpoint File | Size | Speed | Accuracy | Best For |
|-------|----------------|------|-------|----------|----------|
| **ViT-B** | `sam_vit_b_01ec64.pth` | Smallest | Fastest | Good | Testing, quick results, lower-end hardware |
| **ViT-L** | `sam_vit_l_0b3195.pth` | Medium | Medium | Better | Balanced performance/quality |
| **ViT-H** | `sam_vit_h_4b8939.pth` | Largest | Slowest | Best | High-quality results, powerful hardware |
To change models:
1. Download the desired checkpoint
2. Update `SAM_MODEL_SIZE` in `.env`
3. Restart the application
## Using the Application
1. **Upload**: Select a video file (MP4, AVI, MOV, MKV)
2. **Preview**: See the first frame of your video
3. **Select**: Click on the object you want to segment
4. **Process**: Click "Segment Object" to start processing
5. **Download**: Get your segmented video
## Performance Tips
- **GPU Acceleration**: SAM2 runs much faster with CUDA
- **Video Length**: Shorter videos process faster
- **Resolution**: Lower resolutions are quicker to process
- **Points**: 3-5 well-placed points usually work best

app.py Normal file

@@ -0,0 +1,364 @@
import os
import cv2
import numpy as np
from flask import Flask, request, jsonify, send_from_directory, render_template
from flask_cors import CORS
from werkzeug.utils import secure_filename
from dotenv import load_dotenv
import torch
import base64

# segment_anything is imported lazily inside initialize_sam2() so that a
# missing install raises a helpful error instead of failing at module import
# Load environment variables
load_dotenv()
app = Flask(__name__)
CORS(app)
# Configuration
app.config['UPLOAD_FOLDER'] = os.getenv('UPLOAD_FOLDER', 'uploads')
app.config['SEGMENTED_FOLDER'] = os.getenv('SEGMENTED_FOLDER', 'segmented')
app.config['ALLOWED_EXTENSIONS'] = set(os.getenv('ALLOWED_EXTENSIONS', '.mp4,.avi,.mov,.mkv').split(','))
# Ensure directories exist
os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
os.makedirs(app.config['SEGMENTED_FOLDER'], exist_ok=True)
# Initialize SAM2 model
def initialize_sam2(model_size="vit_b"):
"""Initialize the SAM2 model"""
print(f"Initializing SAM2 model ({model_size})...")
try:
from segment_anything import SamPredictor, sam_model_registry
except ImportError:
raise ImportError(
"SAM is not installed. Please install it manually from GitHub:\n"
"git clone https://github.com/facebookresearch/segment-anything.git\n"
"cd segment-anything\n"
"pip install -e .\n"
"Then download the model checkpoint and place it in the root directory."
)
# Map model sizes to checkpoint files
model_configs = {
"vit_h": {
"checkpoint": "sam_vit_h_4b8939.pth",
"model_type": "vit_h"
},
"vit_l": {
"checkpoint": "sam_vit_l_0b3195.pth",
"model_type": "vit_l"
},
"vit_b": {
"checkpoint": "sam_vit_b_01ec64.pth",
"model_type": "vit_b"
}
}
if model_size not in model_configs:
raise ValueError(f"Unknown model size: {model_size}. Choose from: vit_h, vit_l, vit_b")
config = model_configs[model_size]
sam_checkpoint = config["checkpoint"]
model_type = config["model_type"]
# Check if checkpoint file exists
if not os.path.exists(sam_checkpoint):
raise FileNotFoundError(
f"SAM2 checkpoint file '{sam_checkpoint}' not found. "
f"Please download it from https://github.com/facebookresearch/segment-anything "
f"and place it in the root directory."
)
device = "cuda" if torch.cuda.is_available() else "cpu"
sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
sam.to(device=device)
predictor = SamPredictor(sam)
print(f"SAM2 model ({model_type}) initialized on {device}")
return predictor
# Global predictor instance
sam_predictor = None
# Configuration for SAM model
SAM_MODEL_SIZE = os.getenv('SAM_MODEL_SIZE', 'vit_b') # Read from .env or default to ViT-B
def allowed_file(filename):
"""Check if file has allowed extension"""
if '.' not in filename:
return False
# Get the file extension with dot (e.g., '.mp4')
file_extension = '.' + filename.rsplit('.', 1)[1].lower()
# Debug
print(f"🔍 Checking extension: {file_extension}")
print(f"📋 Allowed extensions: {app.config['ALLOWED_EXTENSIONS']}")
return file_extension in app.config['ALLOWED_EXTENSIONS']
@app.route('/')
def index():
"""Main page"""
return render_template('index.html')
@app.route('/test')
def test():
"""Test route"""
return jsonify({'status': 'ok', 'message': 'Flask app is running'})
@app.route('/upload', methods=['POST'])
def upload_video():
"""Handle video upload"""
print("📤 Upload request received")
if 'file' not in request.files:
print("❌ No file part in request")
return jsonify({'error': 'No file part'}), 400
file = request.files['file']
if file.filename == '':
print("❌ No selected file")
return jsonify({'error': 'No selected file'}), 400
print(f"📁 File received: {file.filename}")
print(f"📊 File size: {len(file.read())} bytes")
file.seek(0) # Reset file pointer after reading
# Debug file extension
filename = secure_filename(file.filename)
file_extension = filename.rsplit('.', 1)[1].lower() if '.' in filename else ''
print(f"🔍 File extension: .{file_extension}")
print(f"📋 Allowed extensions: {app.config['ALLOWED_EXTENSIONS']}")
if file and allowed_file(file.filename):
filepath = os.path.join(app.config['UPLOAD_FOLDER'], filename)
file.save(filepath)
print(f"✅ File saved: {filepath}")
# Extract first frame for preview
preview_frame = extract_first_frame(filepath)
if preview_frame is None:
print("⚠️ Could not extract preview frame, using placeholder")
return jsonify({
'message': 'File uploaded successfully (no preview available)',
'filename': filename,
'preview': None
})
print("🖼️ Preview frame extracted successfully")
return jsonify({
'message': 'File uploaded successfully',
'filename': filename,
'preview': preview_frame
})
else:
print(f"❌ File type not allowed: {file_extension}")
return jsonify({'error': f'File type .{file_extension} not allowed. Allowed types: {app.config["ALLOWED_EXTENSIONS"]}'}), 400
def extract_first_frame(video_path):
"""Extract first frame from video"""
try:
# Check if file exists
if not os.path.exists(video_path):
print(f"❌ Video file not found: {video_path}")
return None
cap = cv2.VideoCapture(video_path)
# Check if video opened successfully
if not cap.isOpened():
print(f"❌ Could not open video file: {video_path}")
return None
ret, frame = cap.read()
cap.release()
if ret and frame is not None:
# Convert to base64 for easy transfer
success, buffer = cv2.imencode('.jpg', frame)
if success:
frame_base64 = base64.b64encode(buffer).decode('utf-8')
print(f"✅ Successfully extracted first frame from {video_path}")
return frame_base64
else:
print(f"❌ Failed to encode frame as JPEG")
return None
else:
print(f"❌ Could not read first frame from {video_path}")
return None
except Exception as e:
print(f"❌ Error extracting first frame: {e}")
return None
@app.route('/segment', methods=['POST'])
def segment_object():
"""Handle object segmentation"""
print("🎯 Segment request received")
global sam_predictor
if sam_predictor is None:
print("🔧 Initializing SAM model...")
sam_predictor = initialize_sam2(SAM_MODEL_SIZE)
data = request.json
print(f"📥 Received data: {data}")
if not data or 'filename' not in data or 'points' not in data:
print("❌ Missing required parameters")
return jsonify({'error': 'Missing required parameters'}), 400
filename = data['filename']
points = data['points'] # Expecting [[x1, y1], [x2, y2], ...]
video_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
if not os.path.exists(video_path):
return jsonify({'error': 'Video file not found'}), 404
try:
# Process the video
output_path = process_video_segmentation(video_path, points)
output_filename = os.path.basename(output_path)
print(f"✅ Segmentation completed: {output_filename}")
print(f"📁 Output file path: {output_path}")
print(f"🔍 File exists: {os.path.exists(output_path)}")
print(f"📊 File size: {os.path.getsize(output_path)} bytes")
return jsonify({
'message': 'Segmentation completed',
'output_filename': output_filename,
'debug_file_path': output_path,
'debug_file_exists': os.path.exists(output_path)
})
except Exception as e:
return jsonify({'error': str(e)}), 500
def process_video_segmentation(video_path, points):
"""Process video segmentation using SAM2"""
global sam_predictor
# Create output filename
base_name = os.path.splitext(os.path.basename(video_path))[0]
output_filename = f"{base_name}_segmented.mp4"
output_path = os.path.join(app.config['SEGMENTED_FOLDER'], output_filename)
# Open video
cap = cv2.VideoCapture(video_path)
# Get video properties
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
# Create video writer
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
frame_count = 0
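    # NOTE: set_image() re-computes the image embedding for every frame, which
    # dominates processing time; the first-frame click points are reused on all
    # frames, so fast-moving objects can drift out of the mask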
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
frame_count += 1
print(f"Processing frame {frame_count}/{total_frames}")
# Convert frame to RGB (SAM expects RGB)
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
# Set image for SAM
sam_predictor.set_image(frame_rgb)
# Convert points to numpy array
input_points = np.array(points)
input_labels = np.array([1] * len(points)) # 1 means foreground point
# Get masks
masks, scores, logits = sam_predictor.predict(
point_coords=input_points,
point_labels=input_labels,
multimask_output=False
)
# Create mask from the best prediction
mask = masks[0].astype(np.uint8) * 255
# Apply mask to frame (simple approach - you can customize this)
masked_frame = apply_mask_to_frame(frame, mask)
# Write frame
out.write(masked_frame)
cap.release()
out.release()
return output_path
def apply_mask_to_frame(frame, mask):
"""Apply mask to frame - simple implementation"""
# Create a colored version of the mask (red overlay)
colored_mask = np.zeros_like(frame)
colored_mask[:, :, 2] = mask # Red channel
# Blend the mask with the original frame
alpha = 0.5
result = cv2.addWeighted(frame, 1 - alpha, colored_mask, alpha, 0)
return result
@app.route('/download/<filename>')
def download_file(filename):
"""Download segmented video"""
return send_from_directory(
app.config['SEGMENTED_FOLDER'],
filename,
as_attachment=True
)
@app.route('/preview/<filename>')
def preview_video(filename):
"""Preview original video"""
return send_from_directory(
app.config['UPLOAD_FOLDER'],
filename
)
@app.route('/segmented/<filename>')
def serve_segmented_video(filename):
"""Serve segmented video with proper range request support"""
file_path = os.path.join(app.config['SEGMENTED_FOLDER'], filename)
print(f"🎬 Video request for: {filename}")
print(f"📁 Looking for file at: {file_path}")
print(f"🔍 File exists: {os.path.exists(file_path)}")
if not os.path.exists(file_path):
print(f"❌ File not found: {file_path}")
return jsonify({'error': f'File {filename} not found'}), 404
print(f"✅ Serving file: {file_path}")
# Use send_from_directory with proper MIME type for video
return send_from_directory(
app.config['SEGMENTED_FOLDER'],
filename,
conditional=True,
mimetype='video/mp4'
)
if __name__ == '__main__':
app.run(debug=True, port=5000)

pyproject.toml Normal file

@@ -0,0 +1,7 @@
[project]
name = "censorall"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = []

requirements-uv.txt Normal file

@@ -0,0 +1,14 @@
# UV-compatible requirements
# Note: these pins differ from requirements.txt (extra packages, older torch)
flask==2.3.2
flask-cors==3.0.10
opencv-python==4.7.0.72
numpy==1.24.3
pillow==9.5.0
segment-anything-2==0.1.0
torch==2.0.1
torchvision==0.15.2
moviepy==1.0.3
python-dotenv==1.0.0
werkzeug==2.3.7

requirements.txt Normal file

@@ -0,0 +1,8 @@
flask==2.3.2
flask-cors==3.0.10
opencv-python==4.7.0.72
numpy<2.0
pillow==9.5.0
torch==2.2.0
torchvision==0.17.0
python-dotenv==1.0.0

sam_vit_b_01ec64.pth Normal file

Binary file not shown.

templates/index.html Normal file

@@ -0,0 +1,485 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Video Object Segmentation with SAM2</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
min-height: 100vh;
padding: 20px;
}
.container {
max-width: 1200px;
margin: 0 auto;
background: white;
border-radius: 15px;
box-shadow: 0 10px 30px rgba(0, 0, 0, 0.2);
overflow: hidden;
}
header {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 30px;
text-align: center;
}
h1 {
font-size: 2.5em;
margin-bottom: 10px;
}
.subtitle {
font-size: 1.1em;
opacity: 0.9;
}
.content {
padding: 30px;
}
.upload-section {
margin-bottom: 30px;
padding: 20px;
background: #f8f9fa;
border-radius: 10px;
border: 2px dashed #dee2e6;
}
.upload-btn {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
border: none;
padding: 15px 30px;
font-size: 1.1em;
border-radius: 8px;
cursor: pointer;
transition: transform 0.3s;
}
.upload-btn:hover {
transform: translateY(-2px);
}
.preview-section {
margin: 30px 0;
text-align: center;
}
.preview-container {
position: relative;
display: inline-block;
}
.preview-image {
max-width: 100%;
max-height: 500px;
border: 3px solid #667eea;
border-radius: 10px;
cursor: crosshair;
}
.segmentation-section {
margin: 30px 0;
padding: 20px;
background: #f8f9fa;
border-radius: 10px;
}
.instructions {
background: #e3f2fd;
padding: 15px;
border-radius: 8px;
margin: 20px 0;
border-left: 4px solid #2196f3;
}
.btn {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
border: none;
padding: 12px 25px;
font-size: 1em;
border-radius: 8px;
cursor: pointer;
margin: 5px;
transition: transform 0.3s;
}
.btn:hover {
transform: translateY(-2px);
}
.btn-secondary {
background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%);
}
.status {
margin: 20px 0;
padding: 15px;
border-radius: 8px;
text-align: center;
font-weight: bold;
}
.status-success {
background: #d4edda;
color: #155724;
}
.status-error {
background: #f8d7da;
color: #721c24;
}
.status-info {
background: #d1ecf1;
color: #0c5460;
}
.results-section {
margin: 30px 0;
}
.video-preview {
max-width: 100%;
border-radius: 10px;
border: 3px solid #667eea;
}
.hidden {
display: none;
}
.point-marker {
position: absolute;
width: 12px;
height: 12px;
background: red;
border-radius: 50%;
border: 2px solid white;
transform: translate(-50%, -50%);
pointer-events: none;
}
.loading {
display: inline-block;
width: 50px;
height: 50px;
border: 5px solid #f3f3f3;
border-top: 5px solid #667eea;
border-radius: 50%;
animation: spin 1s linear infinite;
}
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
</style>
</head>
<body>
<div class="container">
<header>
<h1>🎥 Video Object Segmentation with SAM2</h1>
<p class="subtitle">Upload a video, click on objects, and let AI segment them out!</p>
</header>
<div class="content">
<div class="upload-section">
<h2>📤 Upload Your Video</h2>
<p>Supported formats: MP4, AVI, MOV, MKV</p>
<input type="file" id="videoUpload" accept=".mp4,.avi,.mov,.mkv" style="display: none;">
<button class="upload-btn" onclick="document.getElementById('videoUpload').click()">
📁 Select Video File
</button>
<p id="fileInfo" style="margin-top: 15px;"></p>
</div>
<div class="preview-section hidden" id="previewSection">
<h2>🖼️ Preview & Select Object</h2>
<div class="instructions">
<strong>✨ Instructions:</strong> Click on the object you want to segment in the preview image below.
You can click multiple points to help the AI better understand what to segment.
</div>
<div class="preview-container">
<img id="previewImage" class="preview-image" alt="Video Preview">
</div>
<div style="margin-top: 20px;">
<button class="btn" onclick="clearPoints()">🗑️ Clear Points</button>
<button class="btn btn-secondary" onclick="segmentObject()">✂️ Segment Object</button>
</div>
</div>
<div class="segmentation-section hidden" id="segmentationSection">
<h2>⚙️ Processing</h2>
<div class="status status-info" id="processingStatus">
<div class="loading"></div>
<p style="margin-top: 15px;">Segmenting your video... This may take a while depending on video length.</p>
</div>
</div>
<div class="results-section hidden" id="resultsSection">
<h2>🎉 Results</h2>
<div class="status status-success">
<p>Segmentation completed successfully!</p>
</div>
<div style="text-align: center; margin: 20px 0;">
<video id="resultVideo" class="video-preview" controls autoplay loop muted playsinline>
Your browser does not support the video tag.
</video>
</div>
<div style="text-align: center; margin: 20px 0;">
<button class="btn" onclick="downloadResult()">💾 Download Result</button>
<button class="btn btn-secondary" onclick="resetApp()">🔄 Start Over</button>
</div>
</div>
</div>
</div>
<script>
// Global variables
let currentFilename = '';
let selectedPoints = [];
let imageWidth, imageHeight;
// DOM elements
const videoUpload = document.getElementById('videoUpload');
const fileInfo = document.getElementById('fileInfo');
const previewSection = document.getElementById('previewSection');
const previewImage = document.getElementById('previewImage');
const segmentationSection = document.getElementById('segmentationSection');
const resultsSection = document.getElementById('resultsSection');
const resultVideo = document.getElementById('resultVideo');
const processingStatus = document.getElementById('processingStatus');
// Event listeners
videoUpload.addEventListener('change', handleFileUpload);
previewImage.addEventListener('click', handleImageClick);
function handleFileUpload(event) {
const file = event.target.files[0];
if (!file) return;
console.log('File selected:', file.name);
console.log('File type:', file.type);
console.log('File size:', file.size, 'bytes');
fileInfo.textContent = `Selected: ${file.name} (${(file.size / 1048576).toFixed(2)} MB)`;
currentFilename = file.name;
// Upload file
const formData = new FormData();
formData.append('file', file);
console.log('Uploading file...');
fetch('/upload', {
method: 'POST',
body: formData
})
.then(response => response.json())
.then(data => {
if (data.error) {
showError(data.error);
return;
}
console.log('Upload response:', data);
// Show preview section
previewSection.classList.remove('hidden');
if (data.preview) {
// Display preview image
previewImage.src = 'data:image/jpeg;base64,' + data.preview;
// Store image dimensions for point conversion
previewImage.onload = function() {
imageWidth = previewImage.naturalWidth;
imageHeight = previewImage.naturalHeight;
console.log('Preview image loaded:', imageWidth, 'x', imageHeight);
};
previewImage.onerror = function() {
console.error('Failed to load preview image');
previewImage.src = '';
previewImage.alt = 'Preview failed to load';
};
} else {
console.log('No preview available');
previewImage.src = '';
previewImage.alt = 'No preview available';
// You might want to show a message to the user here
}
})
.catch(error => {
console.error('Upload error:', error);
showError('Upload failed: ' + error.message);
});
}
function handleImageClick(event) {
// Get click coordinates relative to the image
const rect = previewImage.getBoundingClientRect();
const x = event.clientX - rect.left;
const y = event.clientY - rect.top;
// Convert to original image coordinates
const scaleX = imageWidth / rect.width;
const scaleY = imageHeight / rect.height;
const originalX = Math.round(x * scaleX);
const originalY = Math.round(y * scaleY);
// Add point
selectedPoints.push([originalX, originalY]);
// Add visual marker
addPointMarker(x, y);
console.log('Selected points:', selectedPoints);
}
function addPointMarker(x, y) {
const marker = document.createElement('div');
marker.className = 'point-marker';
marker.style.left = x + 'px';
marker.style.top = y + 'px';
previewImage.parentNode.appendChild(marker);
}
function clearPoints() {
selectedPoints = [];
const markers = document.querySelectorAll('.point-marker');
markers.forEach(marker => marker.remove());
}
function segmentObject() {
if (selectedPoints.length === 0) {
alert('Please select at least one point on the object you want to segment.');
return;
}
if (!currentFilename) {
showError('No video file selected');
return;
}
// Show processing section
previewSection.classList.add('hidden');
segmentationSection.classList.remove('hidden');
// Send segmentation request
fetch('/segment', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
filename: currentFilename,
points: selectedPoints
})
})
.then(response => {
console.log('Segmentation response status:', response.status);
return response.json();
})
.then(data => {
console.log('Segmentation response data:', data);
if (data.error) {
console.error('Segmentation error:', data.error);
showError(data.error);
return;
}
// Show results section
segmentationSection.classList.add('hidden');
resultsSection.classList.remove('hidden');
// Display result video
const videoUrl = '/segmented/' + data.output_filename;
console.log('Setting video source to:', videoUrl);
resultVideo.src = videoUrl;
// Add error handling for video
resultVideo.onerror = function() {
console.error('Video playback error');
console.error('Video error code:', resultVideo.error ? resultVideo.error.code : 'unknown');
showError('Could not load video. Error code: ' + (resultVideo.error ? resultVideo.error.code : 'unknown'));
};
resultVideo.onloadeddata = function() {
console.log('Video loaded successfully');
console.log('Video duration:', resultVideo.duration, 'seconds');
console.log('Video readyState:', resultVideo.readyState);
};
resultVideo.load();
console.log('Video load() called');
// Try to play the video
setTimeout(() => {
resultVideo.play().catch(e => {
console.error('Autoplay failed:', e);
// This is expected in some browsers without user interaction
});
}, 100);
})
.catch(error => {
showError('Segmentation failed: ' + error.message);
});
}
function downloadResult() {
if (!resultVideo.src) return;
const link = document.createElement('a');
link.href = resultVideo.src;
link.download = resultVideo.src.split('/').pop();
document.body.appendChild(link);
link.click();
document.body.removeChild(link);
}
function resetApp() {
// Reset all state
currentFilename = '';
selectedPoints = [];
fileInfo.textContent = '';
videoUpload.value = '';
// Hide all sections
previewSection.classList.add('hidden');
segmentationSection.classList.add('hidden');
resultsSection.classList.add('hidden');
// Clear markers and video
const markers = document.querySelectorAll('.point-marker');
markers.forEach(marker => marker.remove());
previewImage.src = '';
resultVideo.src = '';
}
function showError(message) {
processingStatus.className = 'status status-error';
processingStatus.innerHTML = `<p>${message}</p>`;
// Show reset button
setTimeout(() => {
processingStatus.innerHTML += '<button class="btn" onclick="resetApp()" style="margin-top: 15px;">Try Again</button>';
}, 1000);
}
</script>
</body>
</html>

uv.lock generated Normal file

@@ -0,0 +1,8 @@
version = 1
revision = 3
requires-python = ">=3.12"
[[package]]
name = "censorall"
version = "0.1.0"
source = { virtual = "." }