Implementation of the ScreenAI model from the paper "ScreenAI: A Vision-Language Model for UI and Infographics Understanding" (https://arxiv.org/abs/2402.04615). The flow is:
img + text -> patches -> ViT -> embed + concat -> attn + ffn -> cross attn + ffn + self attn -> to out
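For orientation, here is a minimal, hedged sketch of that flow in plain PyTorch. Every module name below is illustrative, not the library's actual API (the real blocks live in `screenai.main`); it only mirrors the patchify -> concat -> encode -> decode pipeline.

```python
import torch
import torch.nn as nn

# Illustrative sketch of the ScreenAI data flow described above.
# Module names and internals are assumptions, not the repo's actual code.
class ScreenAISketch(nn.Module):
    def __init__(self, dim=512, image_size=224, patch_size=16, heads=8):
        super().__init__()
        # Patchify the image and project each patch to the model dimension (stand-in for the ViT)
        self.to_patches = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        # Self-attention + feed-forward over the concatenated image/text tokens
        self.encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        # Cross-attention + feed-forward + self-attention over the text tokens
        self.decoder = nn.TransformerDecoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, text, image):
        # img -> patches -> embeddings of shape (batch, num_patches, dim)
        patches = self.to_patches(image).flatten(2).transpose(1, 2)
        # embed + concat image and text tokens
        tokens = torch.cat([patches, text], dim=1)
        # attn + ffn
        encoded = self.encoder(tokens)
        # cross attn + ffn + self attn
        decoded = self.decoder(text, encoded)
        # to out
        return self.to_out(decoded)

model = ScreenAISketch()
out = model(torch.randn(1, 1, 512), torch.rand(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1, 512])
```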
Install
```
pip3 install screenai
```
Usage
```python
import torch

from screenai.main import ScreenAI

# Create a tensor for the image
image = torch.rand(1, 3, 224, 224)

# Create a tensor for the text
text = torch.randn(1, 1, 512)

# Create an instance of the ScreenAI model with the specified parameters
model = ScreenAI(
    patch_size=16,
    image_size=224,
    dim=512,
    depth=6,
    heads=8,
    vit_depth=4,
    multi_modal_encoder_depth=4,
    llm_decoder_depth=4,
    mm_encoder_ff_mult=4,
)

# Perform a forward pass with the given text and image tensors
out = model(text, image)

# Print the shape of the output tensor
print(out.shape)
```
License
MIT
Citation
```bibtex
@misc{baechler2024screenai,
    title={ScreenAI: A Vision-Language Model for UI and Infographics Understanding},
    author={Gilles Baechler and Srinivas Sunkara and Maria Wang and Fedir Zubach and Hassan Mansoor and Vincent Etter and Victor Cărbune and Jason Lin and Jindong Chen and Abhanshu Sharma},
    year={2024},
    eprint={2402.04615},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
Todo
Implement the nn.ModuleList([]) stacking in the encoder and decoder (see the sketch below)
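One way that todo could look, as a hedged sketch: `nn.ModuleList` registers a stack of blocks so their parameters are tracked, and the forward pass simply loops over them. The class name and block type here are assumptions, not the repo's actual internals.

```python
import torch.nn as nn

# Illustrative only: stacks `depth` encoder blocks in an nn.ModuleList.
class MultiModalEncoder(nn.Module):
    def __init__(self, dim: int, depth: int, heads: int):
        super().__init__()
        # nn.ModuleList registers each block so its parameters appear in model.parameters()
        self.layers = nn.ModuleList(
            [
                nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
                for _ in range(depth)
            ]
        )

    def forward(self, x):
        # Run the tokens through each block in sequence
        for layer in self.layers:
            x = layer(x)
        return x
```

The LLM decoder would follow the same pattern, looping over `nn.TransformerDecoderLayer`-style blocks that take both the text tokens and the encoder output.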