NoSql presentation
The document discusses the evolution of journalism in the digital realm, emphasizing the need for news organizations to adapt and become part of the web rather than merely existing on it. It highlights the use of NoSQL technologies, particularly memcached and Solr, to manage data and improve content delivery amid increasing traffic demands. The strategy includes mutualizing news content and engaging users through APIs and dynamic data rather than relying solely on traditional RDBMS systems.
- 1. NoSql at guardian.co.uk • Matthew Wall • Simon Willison
- 14. [Architecture diagram] “I bring you NEWS!!!” • Web servers (×3) • App servers (×3) • Memcached (20Gb) • Oracle • CMS • Data feeds
- 15. Why RDBMS? • 5 years ago, fewer alternatives • Understand operations procedures • Can easily recruit DBAs / devs • Developer/ops tools • Business critical system: a safe choice [shown over the same architecture diagram: Web servers • App servers • Memcached • Oracle • CMS • Data feeds]
- 20. Related content from search engine
- 21. Related content from search engine • Introduction of memcached
- 22. Related content from search engine • Big traffic spike • Introduction of memcached
- 23. Distributed memcached • Protects database from peak load • Entities explicitly decached • Queries given TTL • memcached = database supercharger
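The two caching rules on slide 23 (entities explicitly decached, queries given a TTL) can be sketched roughly as follows. This is a minimal illustration, not the Guardian's code: `TTLCache` is a hypothetical in-process stand-in for a memcached client, the `db` dictionary stands in for the database, and the key names and 60-second TTL are assumptions.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for a memcached client."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        # Expired entries behave like a cache miss.
        if expires_at is not None and self._clock() >= expires_at:
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._store[key] = (value, expires_at)

    def delete(self, key):
        self._store.pop(key, None)


def load_article(cache, db, article_id):
    """Entities: cached without a TTL, removed explicitly on write."""
    key = f"article:{article_id}"
    article = cache.get(key)
    if article is None:
        article = db[article_id]  # only a miss reaches the database
        cache.set(key, article)
    return article


def save_article(cache, db, article_id, article):
    """Write to the database, then explicitly decache the entity."""
    db[article_id] = article
    cache.delete(f"article:{article_id}")


def run_query(cache, query_key, compute, ttl=60):
    """Queries: cached with a TTL, so results age out on their own."""
    result = cache.get(query_key)
    if result is None:
        result = compute()
        cache.set(query_key, result, ttl=ttl)
    return result
```

The split reflects a common trade-off: an entity has a single key that a write can invalidate precisely, whereas it is hard to know which cached query results a write affects, so those are simply allowed to expire.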
- 24. Now we have a stable “broadcast” platform • We know how to scale it • SQL running effectively at core • We’ve finished, right?
- 25. Digital journalism is changing • We can’t cover everything • We can’t compete with everyone • Need to be “part of the web” not just “on the web”
- 27. Mutualised news! Mutualisation of journalism • No longer only broadcasting • User engagement & contribution: content, journalism, data, software • Data curation / linked data • Support engaged developers with data and APIs
- 28. Be a part of the data fabric of the internet
- 29. Platform strategy • Out: Release our data to the world via APIs • In: Rapidly build new functionality outside the core • Write: Ingest, store & present arbitrary data
- 30. Data Out: Content API
- 31. Content API • Delivered using Apache Solr • Document oriented search engine • Loose schema: records, fields, facets • Fields can be multi-value • Supports dynamic field generation • Can apply multiple facets in queries faster than RDBMS
- 35. Is Solr a database?
- 36. Can perform complex queries, including full text search • Can filter results with facets (WHERE clause) • ANYTHING can be a facet. Very powerful. • On our dataset most queries are of a similar cost • Scales very well horizontally • Handles millions of documents
- 37. No transactions • Excellent for certain types of queries • Not truly general purpose • Schema design very important • Search index not really persistence
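To make the "facets as a WHERE clause" point from slide 36 concrete, here is a small sketch that builds a Solr select URL using Solr's standard `q`, `fq`, `facet` and `facet.field` request parameters. The base URL, the core name `content`, and the field names `section` and `tag` are assumptions for illustration; the slides do not show the Content API's actual schema.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, filters=None, facet_fields=(), rows=10):
    """Build a Solr select URL with filter queries and facet counts.

    `filters` maps a field name to the value it must equal; each entry becomes
    an `fq` parameter, which plays the role of a WHERE clause. Values are
    assumed not to contain double quotes.
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    if facet_fields:
        params.append(("facet", "true"))
        # facet.field may be repeated to request counts for several fields.
        for field in facet_fields:
            params.append(("facet.field", field))
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and field names.
url = build_solr_query(
    "http://localhost:8983/solr/content/select",
    "election",
    filters={"section": "politics"},
    facet_fields=["tag", "section"],
)
```

Fetching `url` against a running Solr instance would return matching documents plus per-value counts for each requested facet field.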
- 38. [Architecture diagram] Core: Web servers • App server • Memcached (20Gb) • rdbms • M/Q • CMS. Api: Solr instances (Cloud, EC2)
- 39. API currently powering • iPad app • Site components • External applications • Editors tools • More to follow
- 40. Data In: Application framework
- 41. Application framework • Simple REST/HTTP allows lightweight framework development • Applications proxied for performance • Apps generally hosted in the cloud, hot deployment into production • No RDBMS provided for storage • Can develop in news timeline
- 42. [Architecture diagram] Core: Web servers • App server • Memcached (20Gb) • rdbms • M/Q • CMS. Proxy. Apps: App instances (external hosting, app engine etc)
- 44. Some useful characteristics • Scale down as well as up • Support rapid production-ready prototyping: turn projects around in hours or days • Handle massive traffic spikes
- 45. Desktop analysis • Leaked BNP membership list • Load postcodes to constituencies mapping in to Redis • Generate heatmaps by looking up all 12,000 postcodes
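The lookup step on slide 45 amounts to a key-value read per postcode followed by a count per constituency. A rough sketch, with a plain dictionary standing in for the Redis mapping; the function name and the whitespace/case normalisation are assumptions, not details from the slides.

```python
from collections import Counter


def constituency_counts(postcodes, postcode_to_constituency):
    """Count how many postcodes fall in each constituency.

    `postcode_to_constituency` is a dict standing in for the Redis mapping;
    with Redis, each lookup would be a single key read.
    """
    counts = Counter()
    for postcode in postcodes:
        # Assumed normalisation: strip spaces and upper-case the key.
        key = postcode.replace(" ", "").upper()
        constituency = postcode_to_constituency.get(key)
        if constituency is not None:
            counts[constituency] += 1
    return counts
```

The resulting counts are the input a heatmap renderer would need; postcodes missing from the mapping are skipped rather than guessed.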
- 47. MP’s expenses SELECT * FROM pages WHERE is_reviewed = 0 ORDER BY RAND()
- 49. v2 used Redis • Set difference: pages - reviewed pages [diagram label: labour MP pages] • SRANDMEMBER
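Slides 47 and 49 contrast `ORDER BY RAND()`, which typically makes the database assign a random value to every candidate row and sort them, with a set-based approach in Redis. The idea can be sketched with plain Python sets standing in for Redis sets: the set difference corresponds to Redis's `SDIFF` (or `SDIFFSTORE`) and the random pick to `SRANDMEMBER`. The function name and page identifiers are hypothetical.

```python
import random


def pick_unreviewed(all_pages, reviewed_pages, rng=random):
    """Pick a random page that has not been reviewed yet.

    Python sets stand in for Redis sets here:
      remaining = all_pages - reviewed_pages   ~ SDIFF pages reviewed
      rng.choice(...)                          ~ SRANDMEMBER on the result
    Returns None when every page has been reviewed.
    """
    remaining = all_pages - reviewed_pages
    if not remaining:
        return None
    # Sorting gives a stable sequence, so a seeded rng is reproducible.
    return rng.choice(sorted(remaining))
```

Marking a page as reviewed is then just adding its identifier to the reviewed set, so no per-row flag or full-table sort is involved.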
- 51. Zeitgeist stores pre-calculated results in BigTable • Data comes in from stats system, comments system and OneRiot real-time search API • AppEngine cron tasks populate task queues • Task queues recalculate hotness levels • “Live” BigTable queries are simple SELECT / SORT
- 52. Live debate poll • Over a million votes cast in an hour • Stretched limits of BigTable / AppEngine • Sharded counter pattern to handle writes
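The sharded counter pattern named on slide 52 spreads writes over several independent counters so that no single record becomes a write hotspot; reading the total means summing the shards. A minimal in-memory sketch follows, in which a list of integers stands in for the datastore shard entities and the shard count of 20 is an arbitrary assumption.

```python
import random


class ShardedCounter:
    """In-memory illustration of a sharded counter.

    In a datastore, each shard would be a separate entity updated in its own
    transaction; here each shard is just a slot in a list.
    """

    def __init__(self, num_shards=20, rng=None):
        self._shards = [0] * num_shards
        self._rng = rng or random.Random()

    def increment(self, amount=1):
        # Each write touches one randomly chosen shard, spreading contention.
        index = self._rng.randrange(len(self._shards))
        self._shards[index] += amount

    def total(self):
        # Reads pay the cost instead: every shard must be summed.
        return sum(self._shards)
```

The design trades cheaper, less contended writes for a more expensive read, which suits a poll where votes far outnumber result lookups.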
- 53. Spreadsheets are NoSQL too...
- 54. Google Docs powered infographics
- 56. • Datablog was launched with no development involvement at all - it’s a blog, and a bunch of Google Docs Spreadsheets • Retrieve data as CSV, XLS, JSON, Atom... • “Make a copy” and run your own analysis
- 57. Write: Arbitrary data
- 58. Create schema-free database alongside RDBMS • Index in Solr • Provide access in API • Investigating: CouchDB
- 59. [Architecture diagram] Core: Web servers • App server • Memcached (20Gb) • CMS • Data feeds • M/Q • rdbms • CouchDB? Out: Solr instances (Cloud, EC2). In: Proxy • App instances (external hosting, app engine etc)