Home · twitter/scalding Wiki · GitHub
Home
P. Oscar Boykin edited this page Jul 19, 2016
·
94 revisions
Scalding is a Scala library that makes it easy to write MapReduce jobs in Hadoop. It's similar to other MapReduce platforms like Pig and Hive, but offers a higher level of abstraction by leveraging the full power of Scala and the JVM.
Scalding is built on top of Cascading, a Java library that abstracts away much of the complexity of Hadoop (such as the need to write raw `map` and `reduce` functions).
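To make the "no raw `map` and `reduce` functions" point concrete, here is a minimal word-count sketch using the type-safe API. It assumes `scalding-core` is on the classpath and that the job is launched with `--input` and `--output` arguments; `Tokenizer` is a hypothetical helper written for this illustration, not part of Scalding.

```scala
import com.twitter.scalding._

// Hypothetical helper (not part of Scalding): split a line into lowercase words.
object Tokenizer {
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq
}

// Word count with the type-safe API. Assumes the job is started with
// --input <path> and --output <path>.
class WordCountJob(args: Args) extends Job(args) {
  TypedPipe.from(TextLine(args("input")))              // one String per input line
    .flatMap { line => Tokenizer.tokenize(line) }      // emit each word
    .groupBy { word => word }                          // use the word itself as the key
    .size                                              // count the entries in each group
    .write(TypedTsv[(String, Long)](args("output")))   // tab-separated (word, count) pairs
}
```

The pipeline reads like ordinary Scala collection code; the grouping and counting steps are planned into Hadoop map and reduce phases by Scalding and Cascading rather than written by hand.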
Need a suggestion for where to start? Try the Alice in Wonderland walkthrough which shows how to use Scalding step by step to learn about the book's text.
- Scaladocs: Generated documentation for current version of Scalding.
- Note: `sbt doc` will build scaladocs under the `target/2.9.2/api/` directory, which you can then open in your browser.
- Tutorials
- Beginner
- Getting Started
- Scalding REPL: Learning is better when it's interactive. This tutorial shows off how to interact with your data using the Scalding REPL.
- Alice in Wonderland walkthrough: Step-by-step example of using Scalding in Local mode in the REPL.
- Intro to Scalding Jobs
- Intermediate
- Aggregation using Algebird Aggregators. Continuing the SQL analogy, we see how to use composable Aggregators.
- SQL to Scalding. Canonical ways of translating common SQL idioms to Scalding.
- Advanced
- Building Bigger Platforms With Scalding: some approaches for modular design and composing with Scalding.
- Getting Started with the Matrix library
- Reference/Other
- Type-safe API Reference. This API is very close to the Scala collections API.
- REPL Reference
- Automatic Orderings, Monoids and Arbitraries: using macros to automatically generate needed Ordering, Monoid, Semigroup or Arbitrary instances for case classes and Scala collections.
- Matrix-API-Reference
- Scalding Sources
- Scalding-Commons. The README of the former scalding-commons library.
- Rosetta Code. A collection of MapReduce tasks translated (from Pig, Hive, Cascalog, MapReduce Streaming, etc.) into Scalding.
- Oscar's Scalding Talk at the Hadoop Summit. Slides from Oscar's talk at the Hadoop Summit.
- Upgrading to 0.9.0 means fixing some compile issues. These sed rules may help.
- DEPRECATED: Fields-based API Reference. This is the original Cascading DSL API for Scalding, which uses a named-tuple model. We highly recommend the Type-safe API, using TypedPipe, for any new code. This page also contains many example code snippets illustrating each Scalding function. See Field Rules for more on Fields.
- Scalding-cassandra: support for reading from and writing to Cassandra.
- SpyGlass (https://github.com/ParallelAI/SpyGlass): an advanced-featured HBase wrapper for Cascading and Scalding.
- Scalding: Powerful & Concise MapReduce Programming
- Scalding lecture for UC Berkeley's Analyzing Big Data with Twitter class
- Scalding with CDH3U2 in a Maven project
- Running your Scalding jobs in Eclipse
- Run/Test jobs locally from IntelliJ IDEA
- Running your Scalding jobs in IntelliJ IDEA
- Running Scalding jobs on EMR
- Running Scalding with HBase support: Scalding HBase wiki
- Using the distributed cache
- Calling Scalding from inside your application
- Unit Testing Scalding Jobs
- Using counters
NOTE: all of the following tutorials use the Fields API, which is deprecated.
- Scalding for the impatient: a great set of tutorials on using Scalding, walking through simple to more complex examples (including TF-IDF).
- Movie Recommendations and more in MapReduce and Scalding
- Generating Recommendations with MapReduce and Scalding, a shorter version of the above post.
- Poker collusion detection with Mahout and Scalding
- Portfolio Management in Scalding
- Find the Fastest Growing County in US, 1969-2011, using Scalding
- Dean Wampler's Scalding Workshop. Presented by Dean at StrangeLoop 2012.
- Typesafe's Activator for Scalding. Also created by Dean Wampler.
- Hive, Pig, Scalding, Scoobi, Scrunch and Spark: A Comparison of Hadoop Frameworks
- Why Hadoop MapReduce needs Scala
- How Twitter is doing its part to democratize big data
- Meet the combo powering Hadoop at Etsy, Airbnb and Climate Corp.
- Scalding wins a Bossie award from InfoWorld
- Scalding: Hadoop Word Count in LESS than 70 lines of code
- Using Scalding with other versions of Scala
- Scala and sbt for Homebrew users
- Scala and sbt for MacPorts users
- Comparison to Scrunch and Scoobi
- Powered-By: see who is using Scalding in production.