Apache Spark™ - Unified Engine for large-scale data analytics
Unified engine for large-scale data analytics
Get Started

What is Apache Spark™?
Apache Spark™ is a multi-language engine for executing data engineering,
data science, and machine learning on single-node machines or clusters.
Simple.
Fast.
Scalable.
Unified.
Key features
Batch/streaming data
Unify the processing of your data in batches and real-time streaming, using your preferred language: Python, SQL, Scala, Java or R.
SQL analytics
Execute fast, distributed ANSI SQL queries for dashboarding and ad-hoc reporting. Runs faster than most data warehouses.
Data science at scale
Perform Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling.
Machine learning
Train machine learning algorithms on a laptop and use the same code to scale to fault-tolerant clusters of thousands of machines.
Run now
Install with 'pip'
$ pip install pyspark
$ pyspark
Use the official Docker image
$ docker run -it --rm spark:python3 /opt/spark/bin/pyspark
df = spark.read.json("logs.json")
df.where("age > 21").select("name.first").show()

from pyspark.ml.regression import RandomForestRegressor

# Every record contains a label and feature vector
df = spark.createDataFrame(data, ["label", "features"])
# Split the data into train/test datasets
train_df, test_df = df.randomSplit([.80, .20], seed=42)
# Set hyperparameters for the algorithm
rf = RandomForestRegressor(numTrees=100)
# Fit the model to the training data
model = rf.fit(train_df)
# Generate predictions on the test dataset.
model.transform(test_df).show()

df = spark.read.csv("accounts.csv", header=True)
# Select subset of features and filter for balance > 0
filtered_df = df.select("AccountBalance", "CountOfDependents").filter("AccountBalance > 0")
# Generate summary statistics
filtered_df.summary().show()

Run now
$ docker run -it --rm spark /opt/spark/bin/spark-sql
spark-sql>
SELECT
name.first AS first_name,
name.last AS last_name,
age
FROM json.`logs.json`
WHERE age > 21;

Run now
$ docker run -it --rm spark /opt/spark/bin/spark-shell
scala>
val df = spark.read.json("logs.json")
df.where("age > 21")
  .select("name.first").show()

Run now
$ docker run -it --rm spark /opt/spark/bin/spark-shell
scala>
Dataset<Row> df = spark.read().json("logs.json");
df.where("age > 21")
  .select("name.first").show();

Run now
$ docker run -it --rm spark:r /opt/spark/bin/sparkR
>
df <- read.json(path = "logs.json")
df <- filter(df, df$age > 21)
head(select(df, df$name.first))

The most widely-used engine for scalable computing
Thousands of companies, including 80% of the Fortune 500, use Apache Spark™.
Over 2,000 contributors to the open source project from industry and academia.
Ecosystem
Apache Spark™ integrates with your favorite frameworks, helping to scale them to thousands of machines.
Data science and Machine learning
SQL analytics and BI
Storage and Infrastructure
Spark SQL engine: under the hood
Apache Spark™ is built on an advanced distributed SQL engine for large-scale data
Adaptive Query Execution
Spark SQL adapts the execution plan at runtime, such as automatically setting the number of reducers and join algorithms.
Support for ANSI SQL
Use the same SQL you’re already comfortable with.
Structured and unstructured data
Spark SQL works on structured tables and unstructured data such as JSON or images.
[Chart: TPC-DS 1TB (no stats), with vs. without Adaptive Query Execution. AQE accelerates TPC-DS queries up to 8x.]
Join the community
Spark has a thriving open source community, with contributors from around the globe building features, writing documentation, and assisting other users.