# Video Object Segmentation with SAM2

A web application that allows you to upload videos, click on objects, and segment them out using Meta's SAM2 (Segment Anything Model 2) AI model.

## Features

- 📤 Upload video files (MP4, AVI, MOV, MKV)
- 🖼️ Preview the first frame of the video
- 🎯 Click on objects to select them for segmentation
- ✂️ AI-powered object segmentation using SAM2
- 🎥 Download segmented video results
- 🎨 Beautiful, responsive user interface

## Requirements

- **Python 3.8-3.12** (tested and compatible)
- PyTorch 2.2.0+ with CUDA (recommended for GPU acceleration)
- Flask
- OpenCV
- NumPy
- Segment Anything Model 2

**Python Version Compatibility:**

- ✅ Python 3.8, 3.9, 3.10, 3.11, and 3.12 are all supported
- 🔄 The torch version is selected automatically based on your Python version
- 💡 Python 3.12 users: use the updated requirements (torch 2.2.0+)

## Why use uv?

We recommend using **uv** for this project because:

- ✅ **Faster dependency resolution**: uv is significantly faster than pip
- ✅ **Better virtual environment management**: Cleaner and more reliable venvs
- ✅ **Deterministic builds**: More consistent dependency resolution
- ✅ **Modern Python tooling**: Built with Rust for performance
- ✅ **Better compatibility**: Handles complex dependency trees better

If you're working on Python projects, uv is a great modern alternative to pip + virtualenv!

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/yourusername/video-segmentation-sam2.git
cd video-segmentation-sam2
```

### 2. Install dependencies (using uv - recommended)

First, install uv (a fast Python package installer and virtual environment manager):

```bash
# Install uv
pip install uv

# Create virtual environment and install dependencies
uv venv .venv
source .venv/bin/activate  # On Windows: .\.venv\Scripts\activate
uv pip install -r requirements.txt
```

### 3. Install SAM2 manually

SAM2 needs to be installed manually from GitHub:

```bash
# Clone the segment-anything repository
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything

# Install SAM2 in development mode
pip install -e .

# Download the model checkpoint (ViT-B recommended)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
mv sam_vit_b_01ec64.pth ..
cd ..

# Clean up
rm -rf segment-anything
```

Together, these installation steps will:

- Create a virtual environment using uv
- Install all Python dependencies
- Install SAM2 from source
- Download the ViT-B model checkpoint
- Set up the necessary directories

### Alternative: Standard pip installation

If you prefer not to use uv:

```bash
pip install -r requirements.txt
python setup.py
```

### 4. Download SAM2 model weights

The application uses **ViT-B** (the smaller, faster model) by default. You need the file `sam_vit_b_01ec64.pth` in the root directory.

Download from: [https://github.com/facebookresearch/segment-anything](https://github.com/facebookresearch/segment-anything)

Or use the following command:

```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```

**Model Options:**

- `vit_b` (default): Fastest, good for testing - `sam_vit_b_01ec64.pth`
- `vit_l`: Medium size/performance - `sam_vit_l_0b3195.pth`
- `vit_h`: Best accuracy, largest - `sam_vit_h_4b8939.pth`

You can change the model by modifying `SAM_MODEL_SIZE` in `app.py`.
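
For reference, here is a minimal sketch of how that selection might be wired up. The `SAM_MODEL_SIZE` name comes from this README, and the checkpoint filenames match the downloads above; the `segment_anything` calls are the standard SAM Python API, but the exact code in `app.py` may differ:

```python
import torch
from segment_anything import sam_model_registry, SamPredictor

SAM_MODEL_SIZE = "vit_b"  # switch to "vit_l" or "vit_h" if you downloaded those weights

# Checkpoint files are assumed to sit in the repository root (see the downloads above)
CHECKPOINTS = {
    "vit_b": "sam_vit_b_01ec64.pth",
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_h": "sam_vit_h_4b8939.pth",
}

# Load the model and move it to the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
sam = sam_model_registry[SAM_MODEL_SIZE](checkpoint=CHECKPOINTS[SAM_MODEL_SIZE])
sam.to(device=device)
predictor = SamPredictor(sam)
```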

### 5. Run the application

```bash
python app.py
```

The application will start on `http://localhost:5000`.

## Usage

1. **Upload a video**: Click the "Select Video File" button and choose a video file
2. **Select object**: Click on the object you want to segment in the preview image
3. **Add more points**: Click additional points to help the AI better understand the object
4. **Segment**: Click "Segment Object" to start the segmentation process
5. **Download**: Once processing is complete, preview and download your segmented video

## Configuration

You can configure the application by editing the `.env` file:

```env
FLASK_ENV=development
UPLOAD_FOLDER=uploads
SEGMENTED_FOLDER=segmented
ALLOWED_EXTENSIONS=.mp4,.avi,.mov,.mkv
```
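
As a rough illustration of how these settings could be consumed (this assumes `python-dotenv` is available; the variable names mirror the keys above, but `app.py` may read them differently):

```python
import os
from dotenv import load_dotenv  # assumes the python-dotenv package is installed

load_dotenv()  # pull the .env file into the process environment

UPLOAD_FOLDER = os.getenv("UPLOAD_FOLDER", "uploads")
SEGMENTED_FOLDER = os.getenv("SEGMENTED_FOLDER", "segmented")
# Parse the comma-separated extension list, e.g. {".mp4", ".avi", ".mov", ".mkv"}
ALLOWED_EXTENSIONS = set(os.getenv("ALLOWED_EXTENSIONS", ".mp4,.avi,.mov,.mkv").split(","))

# Make sure the working directories exist before the first upload
os.makedirs(UPLOAD_FOLDER, exist_ok=True)
os.makedirs(SEGMENTED_FOLDER, exist_ok=True)
```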

## Technical Details

### Backend

- **Flask**: Web framework
- **SAM2**: Segment Anything Model 2 for object segmentation
- **OpenCV**: Video processing and frame manipulation
- **PyTorch**: Deep learning framework for running SAM2
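
For orientation, a simplified sketch of the upload flow on the backend (the route name, form field, and response fields here are illustrative, not necessarily what `app.py` exposes):

```python
import os
import cv2
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)
UPLOAD_FOLDER = "uploads"  # assumed default, see the Configuration section

@app.route("/upload", methods=["POST"])  # illustrative endpoint name
def upload_video():
    file = request.files["video"]
    path = os.path.join(UPLOAD_FOLDER, secure_filename(file.filename))
    file.save(path)

    # Read the first frame so the user has something to click on
    cap = cv2.VideoCapture(path)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return jsonify({"error": "Could not read video"}), 400
    return jsonify({"filename": os.path.basename(path),
                    "width": frame.shape[1],
                    "height": frame.shape[0]})
```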

### Frontend

- **HTML5/CSS3**: Responsive user interface
- **JavaScript**: Interactive point selection and AJAX requests
- **Base64 encoding**: For preview image transfer
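
For example, the first frame can be turned into a data URL that the browser shows directly in an `<img>` tag; a minimal sketch (the helper name is hypothetical):

```python
import base64
import cv2

def frame_to_data_url(frame):
    """Encode a BGR OpenCV frame as a base64 JPEG data URL."""
    ok, buffer = cv2.imencode(".jpg", frame)
    if not ok:
        raise ValueError("could not encode frame")
    encoded = base64.b64encode(buffer.tobytes()).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"
```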

### Processing Pipeline

1. Video upload and first-frame extraction
2. User selects points on the object to segment
3. SAM2 processes each frame with the selected points
4. Masks are applied to each frame
5. Processed frames are combined into a new video
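
In code, the core of this pipeline might look roughly like the sketch below. It uses the standard `SamPredictor` interface and OpenCV's `VideoWriter`; the actual `app.py` may batch frames or track the object differently:

```python
import cv2
import numpy as np

def segment_video(predictor, in_path, out_path, points, labels):
    """Run the model on every frame using the user-selected points.

    points: list of (x, y) pixel coordinates clicked by the user
    labels: list of 1s (foreground) / 0s (background), one per point
    """
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        predictor.set_image(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # SAM expects RGB
        masks, _, _ = predictor.predict(
            point_coords=np.asarray(points, dtype=np.float32),
            point_labels=np.asarray(labels, dtype=np.int32),
            multimask_output=False,
        )
        mask = masks[0]  # boolean HxW mask of the selected object
        # Keep the object, black out everything else, and write the frame
        out.write(np.where(mask[..., None], frame, 0).astype(np.uint8))

    cap.release()
    out.release()
```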

## Performance Considerations

- **GPU recommended**: SAM2 runs much faster on a CUDA-enabled GPU
- **Video length**: Longer videos take more time to process
- **Resolution**: Higher-resolution videos require more processing power
- **Point selection**: More points can help with complex objects but may slow down processing
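
If processing is too slow or runs out of memory, one simple mitigation is to shrink frames before they reach the model; a small illustrative helper (not part of the current code):

```python
import cv2

def downscale(frame, max_side=720):
    """Resize a frame so its longest side is at most max_side pixels."""
    h, w = frame.shape[:2]
    scale = max_side / max(h, w)
    if scale >= 1.0:
        return frame  # already small enough
    return cv2.resize(frame, (int(w * scale), int(h * scale)),
                      interpolation=cv2.INTER_AREA)
```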

## Troubleshooting

### Common Issues

**Issue: SAM2 model not found**
- Solution: Download the model checkpoint and place it in the root directory

**Issue: CUDA out of memory**
- Solution: Reduce video resolution or use smaller batch sizes

**Issue: Slow processing on CPU**
- Solution: Use a machine with a GPU or reduce video resolution

**Issue: Video format not supported**
- Solution: Convert your video to MP4 format

## License

This project is licensed under the MIT License. The SAM2 model is provided by Meta Research under its own license.

## Acknowledgements

- Meta Research for the Segment Anything Model
- Flask team for the web framework
- OpenCV team for computer vision tools

## Future Improvements

- [ ] Add support for multiple object segmentation
- [ ] Implement background removal options
- [ ] Add video trimming functionality
- [ ] Support for real-time preview
- [ ] Batch processing of multiple videos
- [ ] Advanced segmentation parameters (threshold, etc.)
- [ ] Cloud deployment options