
Troubleshooting

Common issues and solutions for inference.sh apps.


Import Errors

"ModuleNotFoundError" in Production

Solutions:

  1. Add __init__.py files to all packages

  2. Add the current directory to the Python path:

```python
import sys, os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
```

  3. For local packages, use editable installs:

```txt
-e ./local_package_directory
```

Memory Issues

"CUDA out of memory"

Solutions:

  1. Reduce batch size
  2. Use mixed precision: model.to(dtype=torch.float16)
  3. Enable gradient checkpointing: model.gradient_checkpointing_enable()
  4. Clear cache: torch.cuda.empty_cache()
  5. Increase VRAM in inf.yml
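Items 2 and 4 above can be sketched with plain PyTorch; the `nn.Linear` here is a stand-in for your real model (note that `gradient_checkpointing_enable()` is a Hugging Face Transformers method, not part of core PyTorch):

```python
import torch
import torch.nn as nn

# Stand-in model; substitute your real one.
model = nn.Linear(16, 16)

# Mixed precision: fp16 halves parameter memory versus fp32.
model = model.to(dtype=torch.float16)

# Release cached allocator blocks between requests.
# The guard makes this a safe no-op on CPU-only machines.
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```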

Memory Leaks

Clean up after each request:

```python
import gc, torch

async def run(self, input_data):
    result = self.process(input_data)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    gc.collect()
    return result
```

Device Errors

"Expected all tensors to be on the same device"

Ensure all tensors are on the same device:

```python
input_tensor = input_tensor.to(self.device)
```

"CUDA not available"

  1. Check inf.yml GPU requirements:

```yaml
resources:
  gpu:
    count: 1
    vram: 24  # 24GB
```

  2. Use device detection:

```python
from accelerate import Accelerator
device = Accelerator().device
```

Model Loading Errors

"File not found" After Download

Don't assume file paths:

```python
import os
from huggingface_hub import snapshot_download

model_path = snapshot_download(repo_id="org/model")
config_path = os.path.join(model_path, "config.yaml")
if os.path.exists(config_path):
    ...  # load the config here
```

"Token required for gated model"

Add HF_TOKEN to secrets:

```yaml
secrets:
  - key: HF_TOKEN
    description: HuggingFace token for gated models
```

File Path Issues

Temporary Files Deleted Too Early

Use delete=False:

```python
import tempfile

with tempfile.NamedTemporaryFile(suffix='.jpg', delete=False) as tmp:
    output_path = tmp.name
```
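With `delete=False`, the file survives the `with` block, so remove it yourself once the response has been sent. A small sketch, where `write_output` is a hypothetical helper:

```python
import os
import tempfile

# Hypothetical helper: persist bytes to a temp file that outlives
# the `with` block, returning its path for later use.
def write_output(data: bytes) -> str:
    with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
        tmp.write(data)
        return tmp.name  # still on disk after the block exits

path = write_output(b"fake image bytes")
# ... serve or upload the file ...
os.unlink(path)  # explicit cleanup once you are done
```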

Path Separators

Use os.path.join:

```python
# ✅ Good
path = os.path.join("models", "config", "settings.json")
```
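`pathlib` offers the same portability via the `/` operator, if you prefer object-style paths:

```python
from pathlib import Path

# Builds the correct separator for the host OS.
path = Path("models") / "config" / "settings.json"

# path.parts is OS-independent: ('models', 'config', 'settings.json')
```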

Dependency Issues

Version Conflicts

Pin compatible versions:

```txt
torch==2.6.0
numpy>=1.23.5,<2
```

Flash Attention Build Errors

Use pre-built wheels in requirements.txt instead of compiling from source.


Debug Mode

Add logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)

async def setup(self, config):
    logging.debug(f"Config: {config}")
    logging.info("Starting model load...")
```

Node.js-Specific Issues

"ERR_MODULE_NOT_FOUND"

Ensure "type": "module" is in your package.json and use file extensions in imports:

```javascript
import { helper } from "./helper.js"; // .js required for ESM
```

"Cannot use import statement outside a module"

Your package.json must have:

```json
{
  "type": "module"
}
```

Native Module Build Errors

Some packages (e.g., sharp, canvas) need system libraries. Add them to packages.txt:

```txt
libvips-dev
```

Next: Best Practices - Optimization patterns
