SEART Data Hub

The SEART Data Hub platform lets users easily create large-scale datasets that can be used either to run empirical Mining Software Repositories (MSR) studies or to train Deep Learning models that automate software engineering tasks.

dl4se-model: A module containing domain model classes used for mapping the relational database structure to the programming environment;
dl4se-analyzer: A module containing implementations of code analysis operations running on tree-sitter;
dl4se-transformer: A module containing implementations of code transformation operations running on tree-sitter;
dl4se-crawler: A standalone crawler application that we use to mine source code from GitHub repositories indexed by GitHub Search;
dl4se-server: A Spring Boot server application that acts as our platform back-end;
dl4se-spring: Common Spring Boot configuration and utilities used in both the server and the crawler;
dl4se-website: A front-end web application written in Vue.

Installation and Usage

This section will detail the necessary actions for setting up and running the project locally on your machine.

Environment

Database

Usage

Dockerization

License

MIT

FAQ

How do you implement language-specific analysis heuristics?

Heuristics used to identify test code in Java and Python can be found here and here, respectively. Heuristics used to identify boilerplate code in the same two languages can be found here and here, respectively.
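To give a rough idea of what a test-code heuristic can look like, here is a minimal sketch based only on file paths and names. It is not the implementation used by the platform: the class `TestPathHeuristic`, the method `isLikelyTest`, and the naming conventions it checks (a `test` or `tests` directory, a `Test`/`Tests` suffix for Java, a `test_` prefix or `_test` suffix for Python) are all assumptions made for illustration. Refer to the linked sources for the actual rules.

```java
import java.util.Locale;

/**
 * Hypothetical sketch of a path-based test-code heuristic.
 * NOT the heuristic used by the SEART Data Hub; all conventions are assumed.
 */
final class TestPathHeuristic {

    private TestPathHeuristic() {
    }

    /** Returns true when the path looks like test code under the assumed conventions. */
    static boolean isLikelyTest(String path) {
        if (path == null || path.isEmpty()) {
            return false;
        }
        String normalized = path.replace('\\', '/');
        String fileName = normalized.substring(normalized.lastIndexOf('/') + 1);

        // Assumed directory convention: some parent directory is named "test" or "tests".
        String lowered = "/" + normalized.toLowerCase(Locale.ROOT);
        boolean inTestDir = lowered.contains("/test/") || lowered.contains("/tests/");

        if (fileName.endsWith(".java")) {
            // Assumed Java convention: class name ends with "Test" or "Tests" (case-sensitive).
            String base = fileName.substring(0, fileName.length() - ".java".length());
            return inTestDir || base.endsWith("Test") || base.endsWith("Tests");
        }
        if (fileName.endsWith(".py")) {
            // Assumed Python convention: module name starts with "test_" or ends with "_test".
            String base = fileName.substring(0, fileName.length() - ".py".length());
            return inTestDir || base.startsWith("test_") || base.endsWith("_test");
        }
        // Other file types are out of scope for this sketch.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isLikelyTest("src/main/java/FooTest.java"));
        System.out.println(isLikelyTest("pkg/utils.py"));
    }
}
```

A path-only check like this is cheap but imprecise, since it never looks at the file contents.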

How can I request a feature or ask a question?

If you have ideas for a feature you would like to see implemented or if you have any questions, we encourage you to create a new discussion. By initiating a discussion, you can engage with the community and our team, and we'll respond promptly to address your queries or consider your feature requests.

How can I report a bug?

To report any issues or bugs you encounter, please create a new issue. Providing detailed information about the problem you're facing will help us understand and address it more effectively. Rest assured, we are committed to promptly reviewing and responding to the issues you raise, working collaboratively to resolve any bugs and improve the overall user experience.

