Data Management
for the Internet of Things
NebulaStream is a general-purpose, end-to-end data-management system for cloud-edge-sensor environments built around three core goals:
Ease of Use
Out-of-the-box functionality for multi-modal, multi-frequency streams (e.g., alignment, inference). Enables users to focus on business logic with well-known abstractions and concepts.
Extensibility
Empower users to easily integrate custom data connectors, formats, operators, and optimizations into the system.
Efficiency
Utilize distributed heterogeneous computing devices with hardware-tailored code, adaptive execution, and the interleaved processing of data sources to handle large workloads efficiently.
NebulaStream Vision
Our goal is to process thousands of queries over millions of heterogeneous sources in a massively distributed environment. We achieve this through five core technologies:
- Heterogeneous Hardware Support: Supports a wide range of devices including different architectures (e.g., ARM, x86) and accelerators (e.g., GPUs, TPUs).
- Code Generation: Compiles every query to efficient, low-energy native code.
- In-Network Processing: Pushes operators as close as possible to the data source to reduce network traffic.
- On-Demand Gathering: Utilizes all of the available processing capabilities from the source to the sink so as to apply processing as early as possible, thereby reducing network traffic as much as possible.
- Adaptive Resource Management: Reacts to topology or workload changes without interrupting queries.
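The effect of in-network processing described in the list above can be illustrated with a small sketch. It is not NebulaStream code: the function `tuples_on_network`, the sample readings, and the threshold are hypothetical, and the model assumes a single link between a sensor and the cloud. It only counts how many tuples cross that link when a filter runs at the source versus at the sink.

```python
def tuples_on_network(readings, predicate, filter_at_source):
    """Count tuples that cross the (assumed) source-to-cloud link.

    If the filter runs at the source, only matching tuples are sent.
    Otherwise every reading is sent and filtered after arrival.
    """
    if filter_at_source:
        return sum(1 for r in readings if predicate(r))
    return len(readings)


# Hypothetical temperature readings from one sensor.
readings = [18.5, 19.0, 31.2, 17.8, 35.6, 20.1]


def is_hot(temperature):
    # Hypothetical predicate: keep readings above 30 degrees.
    return temperature > 30.0


sent_when_filtering_in_cloud = tuples_on_network(readings, is_hot, filter_at_source=False)
sent_when_filtering_at_source = tuples_on_network(readings, is_hot, filter_at_source=True)
print(sent_when_filtering_in_cloud, sent_when_filtering_at_source)
```

In this toy setting, pushing the filter to the source means only the matching readings travel over the network; the query result is the same either way.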
NebulaStream Architecture
A modular pipeline that stretches from sensor to cloud, optimising every hop along the way.
- 1 Sources & Sinks: Users can send their data using different source connectors and input formats. Commonly used source connectors include JDBC, MQTT, and TCP, and common input formats include CSV and JSON, which we provide to the user out-of-the-box. In addition, users can add custom connectors or formats. Similarly, users can customize connectors and formatters in the Sink Manager.
- 2 I/O Handling: Unlike other SPEs that handle sources individually and synchronously by assigning one thread per source, NebulaStream interleaves source processing via thread sharing within its own I/O thread pool and applies asynchronous callbacks to reduce waiting time.
- 3 Query Submission: Users can submit queries in our SQL-like query language. NebulaStream provides many built-in operations, like re-sampling and inference. Moreover, it allows users to specify their own operators.
- 4 Query Optimization: After submission, a query plan is created and optimized before hardware-tailored code is generated. The user can modify the optimizations by providing their own rules to the rule engine.
- 5 Adaptive Runtime: During runtime, the query engine schedules query processing in a highly dynamic manner using task abstractions.
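The interleaved I/O handling from step 2 can be sketched in a few lines. This is an illustration under stated assumptions, not NebulaStream's implementation: it uses Python's `asyncio` with a single event-loop thread in place of an I/O thread pool, and the names `read_source`, `run_sources`, and `interleave` are hypothetical. Each source waits for data without blocking the others, and a callback receives every tuple as it arrives.

```python
import asyncio


async def read_source(name, delay, count, on_tuple):
    """Hypothetical source that delivers `count` tuples via a callback."""
    for i in range(count):
        # Stands in for waiting on network I/O; other sources run meanwhile.
        await asyncio.sleep(delay)
        on_tuple(name, i)


async def run_sources(sources):
    """Run all sources concurrently on one shared event-loop thread."""
    received = []

    def on_tuple(name, value):
        # Callback invoked whenever any source has a tuple ready.
        received.append((name, value))

    await asyncio.gather(
        *(read_source(name, delay, count, on_tuple) for name, delay, count in sources)
    )
    return received


def interleave(sources):
    """Synchronous entry point: returns tuples in arrival order."""
    return asyncio.run(run_sources(sources))


if __name__ == "__main__":
    # Two hypothetical sources sharing one thread instead of one thread each.
    for item in interleave([("mqtt", 0.01, 3), ("tcp", 0.015, 3)]):
        print(item)
```

The arrival order across sources depends on timing, but the order within each source is preserved, and no thread sits idle waiting on a single slow source.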
Publications
Project Overview
NebulaStream: An Extensible, High-Performance Streaming Engine for Multi-Modal Edge Applications
System Publications
ExDRa: Exploratory Data Science on Federated Raw Data
Current Researchers
Collaborate with us!
Opportunities
Feel free to reach out to us to learn more about research opportunities as a Postdoc, PhD student, or student assistant. Furthermore, motivated students can also inquire about the possibility of pursuing a Bachelor’s or Master’s thesis with us. Our research topics span all aspects of the IoT: query compilation, query optimization, query processing, query languages, distributed data processing, complex-event processing, machine learning, signal processing, sensor networks, fog computing, temporal-spatial query processing, transactional data processing, and modern hardware, among others.
Contact
Database Systems and Information Management (DIMA) Group
Technische Universität Berlin
Sekr. E-N 7, Room E-N 728
Einsteinufer 17
10587 Berlin
Germany
+49 30 314 23555
nebulastream(at)dima.tu-berlin.de
Imprint and Data Privacy
Imprint
The following Privacy Policy refers to the NebulaStream website, which falls under the responsibility of the Database Systems and Information Management Group, Fak. IV, TU Berlin, and of the project head Prof. Dr. Volker Markl and project manager Dr. Steffen Zeuch.
Prof. Dr. Volker Markl
Technische Universität Berlin
Database Systems and Information Management Group
Electrical Engineering and Computer Science (Faculty IV)
Einsteinufer 17
10587 Berlin
E-Mail: inquiries(at)dima.tu-berlin.de
Editorial and Content Responsibility
Dr. Steffen Zeuch
Technische Universität Berlin
Database Systems and Information Management Group
Electrical Engineering and Computer Science (Faculty IV)
Einsteinufer 17
10587 Berlin
E-Mail: steffen.zeuch(at)tu-berlin.de
Legal notices on copyright
The webpage layout, the graphics used, and all other content are protected by copyright.
Data Privacy
Thank you for your interest in the nebulastream.com (ToBeUpdated). The protection of the personal data of visitors to our page is very important to us. Therefore, we want to inform you about data security.
The following Privacy Policy refers to the website, which falls under the responsibility of Steffen Zeuch. The website is concerned by the Berlin Big Data Center (ToBeUpdated). It is not concerned with commercial transactions or with the exchange of data for marketing purposes.
Subject of Data Privacy
Data privacy covers personal data. According to art. 4 par. 1 of the GDPR (DSGVO), these are data referring to an identified or identifiable individual, hence all data which could be used to identify you. This applies to data such as name, private address, e-mail address, and telephone number, but also to usage data such as your IP address. The DIMA group observes the legal requirements of data privacy and other applicable regulations. We are committed to ensuring that you can trust us concerning your personal data. Therefore, transfers of sensitive data are encrypted. In addition, our websites are protected against damage and unauthorized access by technical measures.
Data Collection and Storage
In general, using our website does not require registering your personal data.
Collection and Storage of Usage Data
To optimize our websites, we collect and store traffic data (such as the website visited, the date and time of access, and the website you are coming from) for a period of two weeks in order to analyze trends in site use and to administer the site. After that, the data is deleted automatically.
These web pages are hosted on servers of GitHub Inc., 88 Colin P Kelly Jr St, San Francisco, CA 94107, USA. Please note the GitHub Privacy Policy, the GitHub Global Privacy Practices, and the Privacy Shield Compliance declaration.
Links
This website contains links to other sites. Please be aware that our consortium and the person responsible for editorial matters and content are not responsible for the contents and the privacy practices of such other sites. We encourage our users to be aware of that when they leave our site and to read the privacy statements of these third-party sites. This privacy statement applies solely to information collected by this website.
Right of Access to Personal Data
You can retrieve information about your stored personal data at any time, free of charge, and without giving reasons. Please contact the address provided below. We will be pleased to assist you if you have any further questions about our data privacy information. Please note that data privacy regulations and the handling of data privacy can change from time to time, so it is necessary to stay informed about changes to data privacy laws and company policies. This data privacy statement applies only to content of TU Berlin and DIMA webservers which provide this data privacy statement and does not cover linked websites of external webservers.
We invite you to contact us if you have questions about this policy at:
Prof. Dr. Volker Markl
Technische Universität Berlin
Database Systems and Information Management Group
Electrical Engineering and Computer Science (Faculty IV)
Einsteinufer 17
10587 Berlin
E-Mail: datenschutz@dima.tu-berlin.de