# Video Object Segmentation with SAM2 - Setup Guide

## Quick Start (macOS/Linux)

### 1. Install uv

```bash
pip install uv
```

### 2. Create virtual environment and install dependencies

```bash
uv venv .venv
source .venv/bin/activate
uv pip install -r requirements.txt
```

### 3. Install SAM manually

```bash
# Clone the segment-anything repository
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything

# Install SAM into the active environment.
# Use `uv pip`: environments created by `uv venv` do not include pip by default.
# A regular (non-editable) install keeps working after the clone is removed below.
uv pip install .

# Download the ViT-B model checkpoint (recommended)
# If wget is not available (e.g. a default macOS install), use `curl -LO <url>` instead
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
mv sam_vit_b_01ec64.pth ..
cd ..

# Clean up
rm -rf segment-anything
```

### 4. Create directories

```bash
mkdir -p uploads segmented
```

### 5. Run the application

```bash
python app.py
```

## Project Structure

```
video-segmentation-sam2/
├── app.py                  # Main Flask application
├── requirements.txt        # Python dependencies
├── .env                    # Configuration
├── templates/
│   └── index.html          # Web interface
├── uploads/                # Uploaded videos (created automatically)
├── segmented/              # Processed videos (created automatically)
└── sam_vit_b_01ec64.pth    # SAM ViT-B model checkpoint
```

## Configuration

Edit the `.env` file to customize:

```env
FLASK_ENV=development
UPLOAD_FOLDER=uploads
SEGMENTED_FOLDER=segmented
ALLOWED_EXTENSIONS=.mp4,.avi,.mov,.mkv
# Options: vit_b, vit_l, vit_h
SAM_MODEL_SIZE=vit_b
```

## Troubleshooting

### NumPy compatibility issues

If you see errors about NumPy 2.x compatibility:

```
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.3.5
```

**Solution:** The `requirements.txt` already specifies `numpy<2.0` to avoid this issue. Rebuild the environment so the pin takes effect:

1. Delete your virtual environment: `rm -rf .venv`
2. Recreate and activate it: `uv venv .venv`, then `source .venv/bin/activate`
3. Reinstall dependencies: `uv pip install -r requirements.txt`

### SAM2 not found

If you get `ImportError: SAM2 is not installed`, make sure you:

1. Cloned the segment-anything repository
2. Ran `uv pip install .` from the segment-anything directory with the virtual environment active
3. Have the checkpoint file in the project root directory

### CUDA not available

If you don't have a GPU, the app will use the CPU (slower). For better performance:

1. Install the CUDA toolkit
2. Install cuDNN
3. Make sure `torch.cuda.is_available()` returns `True`

### Port already in use

If port 5000 is busy:

1. Change the port in `app.py` (last line)
2. Or kill the process using port 5000:

```bash
# Find the process ID (PID) listening on port 5000, then stop it
lsof -i :5000
kill -9 <PID>
```

## Model Options

| Model | Checkpoint File | Size | Speed | Accuracy | Best For |
|-------|----------------|------|-------|----------|----------|
| **ViT-B** | `sam_vit_b_01ec64.pth` | Smallest | Fastest | Good | Testing, quick results, lower-end hardware |
| **ViT-L** | `sam_vit_l_0b3195.pth` | Medium | Medium | Better | Balanced performance/quality |
| **ViT-H** | `sam_vit_h_4b8939.pth` | Largest | Slowest | Best | High-quality results, powerful hardware |

To change models:

1. Download the desired checkpoint into the project root
2. Update `SAM_MODEL_SIZE` in `.env`
3. Restart the application

## Using the Application

1. **Upload**: Select a video file (MP4, AVI, MOV, MKV)
2. **Preview**: See the first frame of your video
3. **Select**: Click on the object you want to segment
4. **Process**: Click "Segment Object" to start processing
5. **Download**: Get your segmented video

## Performance Tips

- **GPU Acceleration**: SAM2 runs much faster with CUDA
- **Video Length**: Shorter videos process faster
- **Resolution**: Lower resolutions are quicker to process
- **Points**: 3-5 well-placed points usually work best
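
To confirm which device the GPU Acceleration tip applies to on your machine, a quick check like the one below can be run inside the virtual environment. This is a minimal sketch, not code taken from `app.py`: the helper name `pick_device` is hypothetical, and it assumes PyTorch is the backend (as the `torch.cuda.is_available()` check in Troubleshooting suggests).

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled PyTorch build can see a GPU, else "cpu".

    Hypothetical helper for illustration; app.py may select its device differently.
    """
    try:
        # Imported lazily so the check still runs before PyTorch is installed.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Segmentation would run on: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build or the driver setup is the likely cause; see "CUDA not available" under Troubleshooting.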
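
The relationship between `SAM_MODEL_SIZE` in `.env` and the checkpoint files in the Model Options table can be sketched as a small lookup. This is an illustration under assumptions, not the logic in `app.py`: `CHECKPOINTS` and `resolve_checkpoint` are hypothetical names, and the sketch assumes the checkpoint sits in the project root as shown in Project Structure.

```python
import os
from pathlib import Path
from typing import Optional

# Checkpoint file names copied from the Model Options table.
CHECKPOINTS = {
    "vit_b": "sam_vit_b_01ec64.pth",
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_h": "sam_vit_h_4b8939.pth",
}


def resolve_checkpoint(model_size: Optional[str] = None, root: Path = Path(".")) -> Path:
    """Map a SAM_MODEL_SIZE value to the checkpoint path expected under `root`.

    Falls back to the SAM_MODEL_SIZE environment variable, then to "vit_b".
    """
    size = (model_size or os.environ.get("SAM_MODEL_SIZE", "vit_b")).strip().lower()
    if size not in CHECKPOINTS:
        raise ValueError(
            f"Unknown SAM_MODEL_SIZE {size!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    return root / CHECKPOINTS[size]


if __name__ == "__main__":
    path = resolve_checkpoint()
    status = "found" if path.exists() else "missing, download it first"
    print(f"{path} ({status})")
```

Running this before starting the app is a quick way to catch a `SAM_MODEL_SIZE` value that does not match a downloaded checkpoint. In the segment-anything package, the same three keys also index `sam_model_registry`.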