Getting started with DPL data
This guide explains how to access Parse.ly Data Pipeline data, including the publicly available demo data and customer-specific Data Pipeline data.
Watch the Data Pipeline getting started video
Download Data Pipeline data using AWS CLI
Setting up the AWS CLI locally is simple. Follow the AWS CLI installation instructions.
Set up credentials to access a private Data Pipeline S3 bucket. This step is not necessary for the public demo-data S3 bucket.
aws configure --profile parsely_dpl
AWS Access Key ID [None]: ENTER ACCESS ID
AWS Secret Access Key [None]: ENTER SECRET KEY
Default region name [None]: us-east-1
Default output format [None]: json
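For reference, running aws configure --profile parsely_dpl stores the access key and secret in your local AWS credentials file (~/.aws/credentials) under that profile name, roughly like this:
[parsely_dpl]
aws_access_key_id = YOUR_ACCESS_ID
aws_secret_access_key = YOUR_SECRET_KEY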
Download one file
Download files from a Parse.ly Data Pipeline S3 bucket or from the public Parse.ly S3 bucket with demo Data Pipeline data.
For the demo Parse.ly Data Pipeline data
aws --no-sign-request s3 cp s3://parsely-dw-parse-ly-demo/events/file_name.gz .
For a private customer-specific S3 bucket
Make sure to use the profile flag.
aws s3 cp s3://parsely-dw-bucket-name-here/events/file_name.gz . --profile parsely_dpl
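The file_name.gz above is a placeholder for an actual object key under the events/ prefix. If you are not sure which files exist, you can list the bucket first; for example, for the demo bucket (drop --no-sign-request and add --profile parsely_dpl for a private bucket):
aws s3 ls --no-sign-request s3://parsely-dw-parse-ly-demo/events/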
Download all files
Download all files in an S3 bucket using the commands below. This may involve a large amount of data.
For the demo Parse.ly Data Pipeline data
aws --no-sign-request s3 cp s3://parsely-dw-parse-ly-demo . --recursive
For a private customer-specific S3 bucket
Make sure to use the profile flag.
aws s3 cp s3://parsely-dw-bucket-name-here . --recursive --profile parsely_dpl
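If you download on a recurring basis, aws s3 sync is an alternative to cp --recursive that only transfers objects that are new or have changed since the last run; for example, for a private bucket:
aws s3 sync s3://parsely-dw-bucket-name-here . --profile parsely_dpl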
Copy the data to an S3 bucket
The AWS CLI also provides a simple way to copy the data directly from a Parse.ly bucket to an S3 bucket that you control.
For the demo Parse.ly Data Pipeline data
aws --no-sign-request s3 cp s3://parsely-dw-parse-ly-demo s3://your-bucket-here --recursive
For a private customer-specific S3 bucket
Make sure to use the profile flag.
aws s3 cp s3://parsely-dw-bucket-name-here s3://your-bucket-here --recursive --profile parsely_dpl
Copy Data Pipeline data to Redshift or Google BigQuery
Parse.ly provides the parsely_raw_data GitHub repository for these use cases.
The README.md in the repository linked above contains detailed installation and usage instructions. The following examples demonstrate common tasks after installing the parsely_raw_data repository.
Copy S3 data to a Redshift database
This command creates an Amazon Redshift table using the specified Parse.ly schema and loads the Data Pipeline data into the new table.
python -m parsely_raw_data.redshift
Copy S3 data to Google BigQuery
This command creates a Google BigQuery table using the specified Parse.ly schema and loads the Data Pipeline data into the new table.
python -m parsely_raw_data.bigquery
Query Data Pipeline data using AWS Athena
AWS Athena provides a SQL interface to query S3 files directly without moving data.
- Create an Athena table using the Parse.ly Data Pipeline Athena schema
- Load the data into the recommended year-month partitions (a filled-in example follows this list):
ALTER TABLE table_name_here ADD PARTITION (year='YYYY', month='MM') location 's3://parsely-dw-bucket-name-here/events/YYYY/MM'
- Use Athena to query the Data Pipeline data
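As a filled-in example of the partition statement above, the following loads March 2024 into a table named parsely_data_pipeline_table_name (the table name, bucket name, and dates are placeholders to replace with your own):
ALTER TABLE parsely_data_pipeline_table_name ADD PARTITION (year='2024', month='03') location 's3://parsely-dw-bucket-name-here/events/2024/03'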

Getting started queries to answer common questions
These queries are formatted for use with Athena to query the Data Pipeline data.
Retrieve all records
This query retrieves all records from the Athena table that reads from the S3 files. Only partitions that have been loaded are returned (see the section above), and filtering on specific partitions reduces Athena query costs.
select * from parsely_data_pipeline_table_name
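When exploring, a minimal variant that filters on loaded partitions (which reduces the data scanned, and therefore the cost) and caps the result set looks like this:
select * from parsely_data_pipeline_table_name
where
year = 'yyyy' and
month = 'mm'
limit 100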
Bot traffic investigation
Bot traffic continues to evolve. Investigate the user agent and IP address for a specific post on a certain day using the following query as a template.
select
user_agent,
visitor_ip,
count(action) as pageviews
from parsely_data_pipeline_table_name
where
year = 'yyyy' and --this makes the query cheaper!
month = 'mm' and --this makes the query cheaper!
action = 'pageview' and
url like '%only-include-unique-url-path-here%' and
date(ts_action) = date 'yyyy-mm-dd'
group by 1,2
order by 3 desc
Engaged-time by referrer type
This is a template query to retrieve engaged time by referrer category.
select
channel,
ref_category,
sum(engaged_time_inc) as engaged_time_seconds,
sum(engaged_time_inc)/60.0 as engaged_time_minutes
from parsely_data_pipeline_table_name
where
year = 'yyyy' and
month = 'mm'
group by 1,2
order by 3 desc
View conversions
Conversion data is included in Data Pipeline data. Query it using the following template.
select
*
from parsely_data_pipeline_table_name
where
year = 'yyyy' and --this makes the query cheaper!
month = 'mm' and --this makes the query cheaper!
action = 'conversion'
Use dbt and a pre-formatted star schema to organize Data Pipeline data in Redshift
Parse.ly provides a dbt (data build tool) project that automates SQL table creation and data pipeline management for Parse.ly data. It generates queryable tables for page views, sessions, loyalty users, subscribers, engagement levels, and read time. The tool handles incremental loading of new data from S3 to SQL tables, reducing configuration time and enabling faster custom query development.
More information is available in the Parse.ly dbt Redshift repository.
How to get started
- Install dbt and the requirements from the main /dbt/ folder one level up: pip install -r requirements.txt
- Edit the following files:
  - ~/.dbt/profiles.yml: Input profile, Redshift cluster, and database information. Refer to the dbt profile configuration documentation.
  - settings/default.py: This is the one-stop shop for all parameters that need to be configured.
- Test the configuration by running python -m redshift_etl. A fully updated settings/default.py file requires no additional parameters. Arguments provided at runtime override settings in default.py.
- Schedule redshift_etl.py to run automatically. Daily runs are recommended (see the example cron entry below).
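One way to automate the daily run, assuming a Unix-like host with cron and that the repository and Python environment from the steps above are available (the path below is illustrative), is to add a line like the following to your crontab (crontab -e):
# Run the Parse.ly Redshift ETL daily at 06:00 server time; adjust the path and Python environment to your installation.
0 6 * * * cd /path/to/parsely-dbt-redshift && python -m redshift_etl >> $HOME/parsely_redshift_etl.log 2>&1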
Schemas/models
- Users Table Grain: One row per unique user ID based on IP address and cookie. This table provides Parse.ly Data Pipeline lifetime engagement data for each user, including loyalty and rolling 30-day loyalty classification.
- Sessions Table Grain: One row per user session. A session represents any activity by one user without being idle for more than 30 minutes. The session table includes total engagement and page view metrics for the entire session, as well as the user types at the time of the session. This enables simplified identification of conversions into loyalty users and subscribers.
- Content Table Grain: One row per article or video. This table contains only the most recent metadata for each article or video and enables simplified reporting and aggregation when metadata changes throughout the article’s lifetime.
- Campaigns Table Grain: One row per campaign. This table contains only the most recent description for each campaign.
- Pageviews Table Grain: One row per page view. This table contains the referrer, campaign, timestamps, engaged time, and at-time-of-engagement metadata for each page view. The page views are organized to show the order and flow of page views within a session for a single user.
- Videoviews Table Grain: One row per videoview. This table contains the referrer, campaign, timestamps, engaged time, and at-time-of-engagement metadata for each video view. The video views are organized to show the order and flow of video views within a session for a single user.
- Custom events Table Grain: One row per custom event sent through the Parse.ly Data Pipeline. This is any event that is not pageview, heartbeat, videostart, or vheartbeat. These can be specified in the dbt_project.yml file and contain keys to join to users, sessions, content, and campaigns.
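As an illustration of how these models fit together once dbt has built them, a rollup of sessions and page views per user type might look like the query below. The table and column names here (sessions, users, user_id, loyalty_status, pageviews) are hypothetical placeholders; use the names your dbt project actually generates.
select
u.loyalty_status, --hypothetical users-table column
count(*) as sessions,
sum(s.pageviews) as total_pageviews --hypothetical sessions-table column
from sessions s
join users u on u.user_id = s.user_id --hypothetical join key
group by 1
order by 2 desc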
Last updated: December 24, 2025