# Video Object Segmentation with SAM2

A web application that allows you to upload videos, click on objects, and segment them out using Meta's SAM2 (Segment Anything Model 2) AI model.

## Features

- 📤 Upload video files (MP4, AVI, MOV, MKV)
- 🖼️ Preview the first frame of the video
- 🎯 Click on objects to select them for segmentation
- ✂️ AI-powered object segmentation using SAM2
- 🎥 Download segmented video results
- 🎨 Beautiful, responsive user interface

## Requirements

- **Python 3.8-3.12** (tested and compatible)
- PyTorch 2.2.0+ with CUDA (recommended for GPU acceleration)
- Flask
- OpenCV
- NumPy
- Segment Anything Model 2

**Python Version Compatibility:**

- ✅ Python 3.8, 3.9, 3.10, 3.11, and 3.12 are all supported
- 🔄 The torch version is selected automatically based on your Python version
- 💡 Python 3.12 users: use the updated requirements (torch 2.2.0+)

## Why use uv?

We recommend using **uv** for this project because:

- ✅ **Faster dependency resolution**: uv is significantly faster than pip
- ✅ **Better virtual environment management**: Cleaner and more reliable venvs
- ✅ **Deterministic builds**: More consistent dependency resolution
- ✅ **Modern Python tooling**: Built with Rust for performance
- ✅ **Better compatibility**: Handles complex dependency trees better

If you're working on Python projects, uv is a great modern alternative to pip + virtualenv!

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/yourusername/video-segmentation-sam2.git
cd video-segmentation-sam2
```

### 2. Install dependencies (using uv - recommended)

First, install uv (a fast Python package installer and virtual environment manager):

```bash
# Install uv
pip install uv

# Create virtual environment and install dependencies
uv venv .venv
source .venv/bin/activate  # On Windows: .\.venv\Scripts\activate
uv pip install -r requirements.txt
```

### 3. Install SAM2 manually

SAM2 needs to be installed manually from GitHub:

```bash
# Clone the segment-anything repository
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything

# Install SAM2 in development mode
pip install -e .

# Download the model checkpoint (ViT-B recommended)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
mv sam_vit_b_01ec64.pth ..
cd ..

# Clean up
rm -rf segment-anything
```

Together, these installation steps will:

- Create a virtual environment using uv
- Install all Python dependencies
- Install SAM2 from source
- Download the ViT-B model checkpoint
- Set up the necessary directories

### Alternative: Standard pip installation

If you prefer not to use uv:

```bash
pip install -r requirements.txt
python setup.py
```

### 4. Download SAM2 model weights

The application uses **ViT-B** (the smaller, faster model) by default. You need the file `sam_vit_b_01ec64.pth` in the root directory.

Download from: [https://github.com/facebookresearch/segment-anything](https://github.com/facebookresearch/segment-anything)

Or use the following command:

```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```

**Model Options:**

- `vit_b` (default): Fastest, good for testing - `sam_vit_b_01ec64.pth`
- `vit_l`: Medium size/performance - `sam_vit_l_0b3195.pth`
- `vit_h`: Best accuracy, largest - `sam_vit_h_4b8939.pth`

You can change the model by modifying `SAM_MODEL_SIZE` in `app.py`.
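
For reference, here is a minimal sketch of how that selection might be wired up. The `SAM_MODEL_SIZE` name comes from this README, and the checkpoint filenames match the downloads above; the `segment_anything` calls are the standard SAM Python API, but the exact code in `app.py` may differ:

```python
import torch
from segment_anything import sam_model_registry, SamPredictor

SAM_MODEL_SIZE = "vit_b"  # switch to "vit_l" or "vit_h" if you downloaded those weights

# Checkpoint files are assumed to sit in the repository root (see the downloads above)
CHECKPOINTS = {
    "vit_b": "sam_vit_b_01ec64.pth",
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_h": "sam_vit_h_4b8939.pth",
}

# Load the model and move it to the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
sam = sam_model_registry[SAM_MODEL_SIZE](checkpoint=CHECKPOINTS[SAM_MODEL_SIZE])
sam.to(device=device)
predictor = SamPredictor(sam)
```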

### 5. Run the application

```bash
python app.py
```

The application will start on `http://localhost:5000`.

## Usage

1. **Upload a video**: Click the "Select Video File" button and choose a video file
2. **Select object**: Click on the object you want to segment in the preview image
3. **Add more points**: Click additional points to help the AI better understand the object
4. **Segment**: Click "Segment Object" to start the segmentation process
5. **Download**: Once processing is complete, preview and download your segmented video

## Configuration

You can configure the application by editing the `.env` file:

```env
FLASK_ENV=development
UPLOAD_FOLDER=uploads
SEGMENTED_FOLDER=segmented
ALLOWED_EXTENSIONS=.mp4,.avi,.mov,.mkv
```
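
As a rough illustration of how these settings could be consumed (this assumes `python-dotenv` is available; the variable names mirror the keys above, but `app.py` may read them differently):

```python
import os
from dotenv import load_dotenv  # assumes the python-dotenv package is installed

load_dotenv()  # pull the .env file into the process environment

UPLOAD_FOLDER = os.getenv("UPLOAD_FOLDER", "uploads")
SEGMENTED_FOLDER = os.getenv("SEGMENTED_FOLDER", "segmented")
# Parse the comma-separated extension list, e.g. {".mp4", ".avi", ".mov", ".mkv"}
ALLOWED_EXTENSIONS = set(os.getenv("ALLOWED_EXTENSIONS", ".mp4,.avi,.mov,.mkv").split(","))

# Make sure the working directories exist before the first upload
os.makedirs(UPLOAD_FOLDER, exist_ok=True)
os.makedirs(SEGMENTED_FOLDER, exist_ok=True)
```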

## Technical Details

### Backend

- **Flask**: Web framework
- **SAM2**: Segment Anything Model 2 for object segmentation
- **OpenCV**: Video processing and frame manipulation
- **PyTorch**: Deep learning framework for running SAM2
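
For orientation, a simplified sketch of the upload flow on the backend (the route name, form field, and response fields here are illustrative, not necessarily what `app.py` exposes):

```python
import os
import cv2
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)
UPLOAD_FOLDER = "uploads"  # assumed default, see the Configuration section

@app.route("/upload", methods=["POST"])  # illustrative endpoint name
def upload_video():
    file = request.files["video"]
    path = os.path.join(UPLOAD_FOLDER, secure_filename(file.filename))
    file.save(path)

    # Read the first frame so the user has something to click on
    cap = cv2.VideoCapture(path)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return jsonify({"error": "Could not read video"}), 400
    return jsonify({"filename": os.path.basename(path),
                    "width": frame.shape[1],
                    "height": frame.shape[0]})
```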

### Frontend

- **HTML5/CSS3**: Responsive user interface
- **JavaScript**: Interactive point selection and AJAX requests
- **Base64 encoding**: For preview image transfer
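
For example, the first frame can be turned into a data URL that the browser shows directly in an `<img>` tag; a minimal sketch (the helper name is hypothetical):

```python
import base64
import cv2

def frame_to_data_url(frame):
    """Encode a BGR OpenCV frame as a base64 JPEG data URL."""
    ok, buffer = cv2.imencode(".jpg", frame)
    if not ok:
        raise ValueError("could not encode frame")
    encoded = base64.b64encode(buffer.tobytes()).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"
```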

### Processing Pipeline

1. Video upload and first-frame extraction
2. User selects points on the object to segment
3. SAM2 processes each frame with the selected points
4. Masks are applied to each frame
5. Processed frames are combined into a new video
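
In code, the core of this pipeline might look roughly like the sketch below. It uses the standard `SamPredictor` interface and OpenCV's `VideoWriter`; the actual `app.py` may batch frames or track the object differently:

```python
import cv2
import numpy as np

def segment_video(predictor, in_path, out_path, points, labels):
    """Run the model on every frame using the user-selected points.

    points: list of (x, y) pixel coordinates clicked by the user
    labels: list of 1s (foreground) / 0s (background), one per point
    """
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        predictor.set_image(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # SAM expects RGB
        masks, _, _ = predictor.predict(
            point_coords=np.asarray(points, dtype=np.float32),
            point_labels=np.asarray(labels, dtype=np.int32),
            multimask_output=False,
        )
        mask = masks[0]  # boolean HxW mask of the selected object
        # Keep the object, black out everything else, and write the frame
        out.write(np.where(mask[..., None], frame, 0).astype(np.uint8))

    cap.release()
    out.release()
```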

## Performance Considerations

- **GPU recommended**: SAM2 runs much faster on a CUDA-enabled GPU
- **Video length**: Longer videos take more time to process
- **Resolution**: Higher-resolution videos require more processing power
- **Point selection**: More points can help with complex objects but may slow down processing
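
If processing is too slow or runs out of memory, one simple mitigation is to shrink frames before they reach the model; a small illustrative helper (not part of the current code):

```python
import cv2

def downscale(frame, max_side=720):
    """Resize a frame so its longest side is at most max_side pixels."""
    h, w = frame.shape[:2]
    scale = max_side / max(h, w)
    if scale >= 1.0:
        return frame  # already small enough
    return cv2.resize(frame, (int(w * scale), int(h * scale)),
                      interpolation=cv2.INTER_AREA)
```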

## Troubleshooting

### Common Issues

**Issue: SAM2 model not found**
- Solution: Download the model checkpoint and place it in the root directory

**Issue: CUDA out of memory**
- Solution: Reduce video resolution or use smaller batch sizes

**Issue: Slow processing on CPU**
- Solution: Use a machine with a GPU or reduce video resolution

**Issue: Video format not supported**
- Solution: Convert your video to MP4 format

## License

This project is licensed under the MIT License. The SAM2 model is provided by Meta Research under its own license.

## Acknowledgements

- Meta Research for the Segment Anything Model
- Flask team for the web framework
- OpenCV team for computer vision tools

## Future Improvements

- [ ] Add support for multiple object segmentation
- [ ] Implement background removal options
- [ ] Add video trimming functionality
- [ ] Support for real-time preview
- [ ] Batch processing of multiple videos
- [ ] Advanced segmentation parameters (threshold, etc.)
- [ ] Cloud deployment options