A Python package (and website) to automatically attempt to find GitHub
repositories that are similar to academic papers.
Installation
Stable Release: pip install papers-without-code
Development Head: pip install git+https://github.com/evamaxfield/papers-without-code.git
Usage
Provide a DOI, SemanticScholarID, CorpusID, ArXivID, ACL,
or URL from semanticscholar.org, arxiv.org, aclweb.org,
acm.org, or biorxiv.org. DOIs can be provided as is.
All other IDs should be given with their type, for example:
doi:10.18653/v1/2020.acl-main.447
or CorpusID:202558505 or url:https://arxiv.org/abs/2004.07180.
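As a rough illustration of the "type:value" convention above, the sketch below splits a query into its type and value and treats anything without a recognised prefix as a bare DOI. The function name `split_query` and the prefix list are hypothetical and not part of the package. Only the three prefixes shown in the examples are included, and matching them case-insensitively is an assumption.

```python
from __future__ import annotations

# Only the prefixes that appear in the examples above; the exact spelling of
# the other ID types is not assumed here.
KNOWN_PREFIXES = ("doi", "corpusid", "url")


def split_query(query: str) -> tuple[str, str]:
    """Split 'type:value' into (type, value); bare input is treated as a DOI.

    Hypothetical helper for illustration only, not the package's API.
    """
    query = query.strip()
    # Split on the first colon only, so URLs keep their own "https://" intact.
    prefix, sep, rest = query.partition(":")
    if sep and prefix.lower() in KNOWN_PREFIXES:
        return prefix.lower(), rest
    # No recognised prefix: DOIs can be provided as is.
    return "doi", query
```

For example, `split_query("url:https://arxiv.org/abs/2004.07180")` keeps the full URL as the value because only the first colon is used as the separator.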
CLI
pip install papers-without-code
pwoc query
# or pwoc path/to/file.pdf
⚠️ Before using PWOC with a PDF, you must be logged in to the Docker CLI via docker login,
because we automatically fetch, spin up, and tear down containers for processing. ⚠️
How it Works
In short, we pass the query on to the Semantic Scholar search API,
which provides us with basic details about the paper. We use
a prompted gpt-3.5-turbo with langchain to extract keywords from the
title and abstract. We then make multiple threaded requests to GitHub's API
for repositories which match the keywords. Once we have all the possible repositories
back, we rank them by similarity between the repository's README and the paper's
abstract (or, if the abstract is not available, its title).
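The fan-out and ranking steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation. The `search` callable is a hypothetical stand-in for a GitHub API request, and the similarity measure is a plain bag-of-words cosine, swapped in so the example stays self-contained. The source does not say which similarity measure is actually used.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a simple stand-in measure)."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank_repositories(
    keywords: Iterable[str],
    search: Callable[[str], list[tuple[str, str]]],
    reference_text: str,
) -> list[tuple[str, float]]:
    """Search per keyword in threads, then rank repositories by README similarity.

    `search` is a hypothetical callable mapping one keyword to a list of
    (repository_name, readme_text) pairs; `reference_text` is the abstract,
    or the title when no abstract is available.
    """
    # One request per keyword, issued concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        batches = list(pool.map(search, keywords))

    # The same repository can match several keywords; keep it once.
    readmes: dict[str, str] = {}
    for batch in batches:
        for name, readme in batch:
            readmes.setdefault(name, readme)

    scored = [
        (name, cosine_similarity(readme, reference_text))
        for name, readme in readmes.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Passing `search` in as a parameter keeps the sketch runnable offline. A real version would wrap an authenticated GitHub search request and need to respect rate limits.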
When using Papers without Code locally and providing a filepath, the only change to
this workflow is how the paper details are gathered: we use GROBID to extract the
title, abstract, and author list.
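GROBID returns its results as TEI XML, so the extraction step might look roughly like the sketch below. It is a sketch under assumptions, not the package's code. The function name `paper_details_from_tei` is hypothetical, and the element paths reflect the usual layout of a GROBID header (title under titleStmt, authors under sourceDesc, abstract under profileDesc), which may vary between GROBID versions.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}


def _text(element: ET.Element | None) -> str:
    """Return an element's full text with whitespace collapsed, or ''."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def paper_details_from_tei(tei_xml: str) -> dict:
    """Pull title, abstract, and author names out of a TEI header.

    Hypothetical helper; the paths below are assumed, not taken from the package.
    """
    root = ET.fromstring(tei_xml)
    title = _text(root.find(".//tei:titleStmt/tei:title", NS))
    abstract = _text(root.find(".//tei:profileDesc/tei:abstract", NS))

    authors = []
    # Look only under sourceDesc so authors of cited works are not picked up.
    source = root.find(".//tei:sourceDesc", NS)
    if source is not None:
        for author in source.iter(f"{{{TEI_NS}}}author"):
            pers_name = author.find("tei:persName", NS)
            if pers_name is None:
                continue
            # Join forename(s) and surname in document order.
            parts = [_text(part) for part in pers_name]
            name = " ".join(p for p in parts if p)
            if name:
                authors.append(name)

    return {"title": title, "abstract": abstract, "authors": authors}
```

The returned title and abstract can then feed the same keyword extraction and ranking steps used for queries resolved through Semantic Scholar.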