PyTorchVideo · A deep learning library for video understanding research 



A deep learning library for video understanding research
Based on PyTorch
Built using PyTorch. Makes it easy to use all the PyTorch-ecosystem components.
Reproducible Model Zoo
A variety of state-of-the-art pretrained video models, with their associated benchmarks, ready to use.
Efficient Video Components
Fast, efficient, video-focused components that are easy to use, with support for hardware-accelerated inference.
Get Started
- Install pytorchvideo (confirm the requirements by following the instructions here)
    pip install pytorchvideo
- Try video classification with the Model Zoo (for detailed instructions, refer to the PyTorchVideo Model Zoo Inference Tutorial)
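    # Import the required components (module paths follow the inference tutorial)
    import torch
    from torchvision.transforms import Compose, Lambda
    from torchvision.transforms._transforms_video import CenterCropVideo, NormalizeVideo
    from pytorchvideo.data.encoded_video import EncodedVideo
    from pytorchvideo.transforms import (
        ApplyTransformToKey,
        ShortSideScale,
        UniformTemporalSubsample,
    )

    # Preprocessing parameters. These values are assumptions for slow_r50;
    # confirm them against the tutorial for the model you load.
    num_frames = 8
    side_size = 256
    crop_size = 256
    mean = [0.45, 0.45, 0.45]
    std = [0.225, 0.225, 0.225]

    # Load pre-trained model and switch it to inference mode
    model = torch.hub.load('facebookresearch/pytorchvideo', 'slow_r50', pretrained=True)
    model = model.eval()

    # Load video
    video = EncodedVideo.from_path('some_video.avi')

    # Compose video data transforms
    transform = ApplyTransformToKey(
        key="video",
        transform=Compose(
            [
                UniformTemporalSubsample(num_frames),
                Lambda(lambda x: x / 255.0),
                NormalizeVideo(mean, std),
                ShortSideScale(size=side_size),
                CenterCropVideo(crop_size=(crop_size, crop_size)),
            ]
        ),
    )

    # Get clip
    clip_start_sec = 0.0  # secs
    clip_duration = 2.0  # secs
    video_data = video.get_clip(start_sec=clip_start_sec, end_sec=clip_start_sec + clip_duration)
    video_data = transform(video_data)

    # Run the model on the clip (add a batch dimension first)
    inputs = video_data["video"]
    with torch.no_grad():
        preds = model(inputs[None, ...])

    # Generate top 5 predictions
    preds = torch.nn.functional.softmax(preds, dim=1)
    pred_class_ids = preds.topk(k=5).indices[0]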
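
Two steps of the snippet above, picking evenly spaced frames and turning model scores into a top-k list, can be inspected without a video file or model weights. The following is a minimal plain-Python sketch of the idea, not the library's implementation: the function names `uniform_temporal_indices`, `softmax`, and `top_k` are hypothetical, and the rounding in the library's own `UniformTemporalSubsample` may differ.

```python
import math


def uniform_temporal_indices(total_frames, num_samples):
    """Return num_samples evenly spaced frame indices in [0, total_frames - 1].

    Illustrative stand-in for a uniform temporal subsample (hypothetical helper).
    """
    if total_frames < 1 or num_samples < 1:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    last = total_frames - 1
    # Integer arithmetic keeps the first index at 0 and the last at total_frames - 1.
    return [(i * last) // (num_samples - 1) for i in range(num_samples)]


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Return the indices of the k largest probabilities, highest first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # A 2-second clip at an assumed 30 fps has 60 frames; keep 8 of them.
    print(uniform_temporal_indices(60, 8))  # → [0, 8, 16, 25, 33, 42, 50, 59]

    # Made-up scores for four classes, purely for illustration.
    probs = softmax([2.0, 0.5, 3.0, -1.0])
    print(top_k(probs, 2))  # → [2, 0]
```

With a real model, the indices returned by the top-k step would be looked up in the dataset's class-name mapping to get human-readable labels.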