Excerpting Best Practices Hinge on Intent
Mac Slocum
March 2, 2009
A piece in the New York Times reignites the fair use debate by asking: How much excerpting does fair use cover?
It's a reasonable question, particularly since Google News, the Huffington Post and countless other sites rely on excerpt aggregation to drive traffic and sell ads. But the rules of excerpting are also -- to steal a line from Steve Jobs -- "a bag of hurt."
Fair use is a doctrine, and as much as editors, bloggers and others with an excerpting bent wish for structure (word count, percentage used, image size, etc.), it's not going to happen. Fair use is contextual and case-by-case. That's why Henry Blodget, co-founder of Silicon Alley Insider, has the right perspective:
"To excerpt others the way we want to be excerpted ourselves."
Intent is the key to proper excerpting. If your intent is to single out someone else's work, and drive attention and its associated benefits and detriments to the creator of that work, then excerpts will be short and filled with outbound links. But if your intent is to fool Google, boost your traffic, and use someone else's material to further your own efforts, then excerpts will be long and link-free -- or they'll contain links to your material.
Excerpting is an extension of white-hat vs. black-hat search engine optimization. The white hats understand that search engines are the essential utility on the Web. Gaming them for personal gain erodes value and reduces opportunities for everyone. Black hats care only about short-term efforts, so they do anything they can to turn attention into quick advertising revenue. What black hats don't realize -- or care about -- is the impact their actions have on the structure of the Internet. They're jackhammering the foundation they're standing on.
Sites that push the boundaries of excerpting are engaged in the same self-destructive behavior. They may see short-term traffic and revenue spikes, but the source sites will eventually cry foul and enact their own Draconian countermeasures. Long-term, this doesn't benefit anyone. Sites that rely on excerpted information will lose access, and originating sources will lose attention. To be effective, excerpting needs to be a mutually beneficial relationship that provides value to everyone involved. The only "rule" is intent.
Hearst Gets Into the E-Reader Game
Mac Slocum
February 27, 2009
Hearst Corp. is developing its own wireless e-reader that may debut this year. From Fortune:
According to industry insiders, Hearst, which publishes magazines ranging from Cosmopolitan to Esquire and newspapers including the financially imperiled San Francisco Chronicle, has developed a wireless e-reader with a large-format screen suited to the reading and advertising requirements of newspapers and magazines. The device and underlying technology, which other publishers will be allowed to adapt, is likely to debut this year.
The larger screen size will put the Hearst reader in the same class as devices from Plastic Logic and iRex.
Fortune says Hearst isn't discussing product specs, but the company has a longtime association with E Ink. Last September, Esquire published the first E Ink magazine cover.
TOC Twitter Visualization Contest Winner
Andrew Savikas
February 26, 2009
The winner of our impromptu contest for best visualization of the TOC Conference Twitter activity is Stephen Smith for his tag clouds and stats over at toctweet.com:

Congrats to Steve, who gets a free full pass to TOC 2010! (With an honorable mention to @thewritermama for banging out 720(!) tweets during the show.)
Indigo's Shortcovers Launched Today: A Good Start, But Room for Reader Improvement
Andrew Savikas
February 26, 2009
The Shortcovers website and companion iPhone and BlackBerry apps launched today (we posted a sneak preview back in January). Put simply, it's a website for buying ebooks. But there are a few interesting twists that (for now) set it apart.
Though most of the current content is books, the primary unit of the service is the "shortcover" -- things like an article, a blog post, and a book chapter. That means publishers have the option of making individual chapters available for sale (or as free samples). But perhaps the more interesting consequence of that is something they're calling "mixes," where readers can combine multiple shortcovers into a single "mix" (think iTunes playlist), and share that with other readers. Though my search was admittedly brief, I wasn't able to find any for-pay content available for inclusion in a mix.
They also definitely understand the social aspect of reading. Beyond the mixes, readers can also upload their own content, rate content, and share content (via Twitter or email).
On the downside, although some content is downloaded locally to the iPhone, most of the service only really works when you're online. Also, the navigation within books isn't very intuitive, and the interface doesn't drop away while reading (the navigation and settings bars at the top and bottom remain on screen).
And (sadly unsurprisingly), the reader appears to have trouble displaying complex content like lists and tables, and computer code (the ones I looked at either didn't display the code at all, or displayed it in a regular variable-width font). I've sent a note to the Shortcovers folks to try and learn more, but I'm continually surprised by how poorly many of these reading systems (including the Kindle, until very recently) have handled kinds of content that have been part of standard HTML for well over a decade. Here are some screenshots of the problem:
I'd be more sympathetic if the iPhone SDK didn't already include the WebKit framework for rendering HTML. Sigh.
But overall it's a decent start, and an impressive first real entry into the mobile reading space from an existing print retailer.
Several more iPhone screenshots are below:
Hallway Video from TOC Conference: Tim O'Reilly on Open Publishing
Andrew Savikas
February 26, 2009
The folks from the RIT Open Publishing Lab have posted a short video talking with Tim O'Reilly in the hallway of the TOC Conference about Open Publishing:
Taxonomies and Starting With XML
Laura Dawson
February 25, 2009
This is an excerpt from a blog post I wrote last week on taxonomies and chunking.
Last October, the StartWithXML team wrote a post called "To Chunk or Not To Chunk," where we discussed tagging and infrastructure issues, and a discussion ensued about what happens when you don't know what you'll be using chunks for. How do you tag those?
Later, in our StartwithXML One-Day Forum, we included a presentation on tagging and chunking best practices, where it was pointed out that no taxonomy for chunk-level content currently exists.
We have taxonomies for book-level content. These include formalized code sets such as the Library of Congress subject codes, the BISAC codes, and the Dewey Decimal System, among others. There are also informal code sets, like the tag sets on Shelfari or LibraryThing. There are proprietary taxonomies at Amazon and B&N.com that enable effective browsing.
But nothing like this exists for sub-book-level content. It's never been traded before. We've never really needed a taxonomy for it before.
Other industries that traditionally distribute "chunks" have their own taxonomies that might prove useful in building a book-chunk schema. These include the IPTC news codes, which identify the content of a particular news story -- that's the closest analogy I can find for small gobbets of content that require organization.
Industries have proprietary taxonomies to identify certain concepts -- culinary arts, music, agriculture, engineering, the sciences, literature and criticism, education, and on and on and on. But these do not necessarily identify concepts within a book.
Some might argue that we don't necessarily need taxonomies -- why can't we use natural-language search and the semantic Web to "bubble up" the "right" concepts? I'd argue that words don't always mean what we think they mean. A classic example from my library days is the term "mercury." That could mean the planet, the car or the element. Proponents of semantic search would say that the context in which "mercury" is mentioned should take care of defining that term. I'd say that holds in about 50 percent of cases, but not dependably in anything like 75 to 100 percent of them.
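To make the "mercury" problem concrete, here is a minimal Python sketch contrasting a plain keyword match with a lookup against a controlled vocabulary. Everything in it is an assumption made up for illustration: the taxonomy codes, the chunk titles and the function names are hypothetical, and they do not come from BISAC, IPTC or any other real code set.

```python
# Hypothetical controlled vocabulary: each code names exactly one concept.
TAXONOMY = {
    "ASTRO.PLANET.MERCURY": "Mercury (planet)",
    "AUTO.MAKE.MERCURY": "Mercury (car)",
    "CHEM.ELEMENT.HG": "Mercury (element)",
}

# Hypothetical chunks, each tagged by a person with one taxonomy code.
CHUNKS = [
    {"title": "Orbit and rotation",
     "text": "Mercury circles the sun every 88 days.",
     "code": "ASTRO.PLANET.MERCURY"},
    {"title": "Restoring a classic",
     "text": "The Mercury needed a new carburetor.",
     "code": "AUTO.MAKE.MERCURY"},
    {"title": "Handling spills",
     "text": "Mercury is liquid at room temperature.",
     "code": "CHEM.ELEMENT.HG"},
]


def keyword_search(chunks, word):
    """Return titles of chunks whose text contains the word (case-insensitive)."""
    return [c["title"] for c in chunks if word.lower() in c["text"].lower()]


def taxonomy_search(chunks, code):
    """Return titles of chunks tagged with exactly this taxonomy code."""
    return [c["title"] for c in chunks if c["code"] == code]


# The keyword match cannot tell the three meanings apart.
ambiguous = keyword_search(CHUNKS, "mercury")

# The coded lookup returns only the chunk about the element.
precise = taxonomy_search(CHUNKS, "CHEM.ELEMENT.HG")
```

The keyword search returns all three chunks, while the coded lookup returns only the one about the element. The cost, of course, is that someone has to assign those codes in the first place.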
My original post gets into more detail about why taxonomies are important search tools, and how the digitization of books requires a good taxonomy ... and who should do it.
Expectation of Fair Pricing, Not Free
Peter Brantley
February 23, 2009
At Dear Author, a post stating that not all content should be expected to be free; rather it must be provided, free or not, in a realistic understanding of consumer needs and expectations, which might mean changing the way you do business.
What content providers must realize is that a changing business model wherein revenues are no longer captured in the same way does not mean that content is not without value or that people will not pay, in some way, to use that content. I think many people recognize that in order to have worthwhile content, we must pay in some way for it. Consumers have reduced the value of the album, but have not determined that music itself is without value. Consumers might believe that digital books have reduced cost given the costs of production, distribution and warehousing; but it is not our belief that books are without value altogether or that all books must be provided for free. I think what consumers are looking for is a fair trade. Content creators provide the best content they possibly can and for a fair price allow the consumers to utilize it in the way that it fits into their lives.
Virginia Open Sourcing Physics Textbook ("Flexbook")
Andrew Savikas
February 18, 2009
I was part of a brief Twitter exchange recently with Cengage's Ken Brooks about the cost of textbooks:
kenbrooks: @doctorow #toc That depends entirely on the type of book. A K-12 reading program costs $millions.
andrewsavikas: @kenbrooks not necessarily. See ck12.org
kenbrooks: @andrewsavikas Talk to McGraw Hill or Pearson about basal reading programs. The intricacies are staggering. #toc
I like Ken a lot personally (and respect him a ton professionally), and I have no reason to doubt that it does take millions to develop many educational programs. But my reference to ck12.org (whose founder, Neeru Khosla, keynoted at TOC 2008) was because if it does cost that much, then something's wrong with the system, and that's not likely to change without the work of groups like ck12.
In fact, Virginia is already in the process of developing an open-source "flexbook" for physics using the ck12 platform:
Secretary of Technology Aneesh Chopra and Secretary of Education Tom Morris today announced the selection of thirteen individuals to form a core team to pilot the development and release of an open–source physics "flexbook" for Virginia. This electronic material will focus on high school physics and contain contemporary and emerging 21st century physics and modern laboratory experiments.
The Virginia Physics "Flexbook" project is a collaborative effort of the Secretaries of Education and Technology and the Department of Education that seeks to elevate the quality of physics instruction across the Commonwealth by allowing educators to create and compile supplemental materials relating to 21st century physics in an open–source format that can be used to strengthen physics content. The Commonwealth is partnering with the Palo Alto, California–based non–profit, CK–12 on this initiative as they will provide the free, open–source technology platform to facilitate the publication of the newly developed content as a "flexbook" — defined simply as an adaptive, web–based set of instructional materials.
"We need transformational ideas to ensure all Virginians are educated to compete in an increasingly competitive global economy," said Secretary Chopra. "This pilot initiative is a step in the right direction to introduce our students to contemporary physics topics and lab materials at no additional cost to the taxpayers or students," added Secretary Morris.
There is certainly a place for the investment-intensive educational publishing programs that only a firm with the resources of Cengage or Pearson or McGraw-Hill can provide. But there's also enormous opportunity to try new models that take advantage of the kind of collaboration that underpins all of academia to develop and distribute quality learning material for students at lower costs. (BTW, ck12 is hiring.)
OMG. Best. TOC. Wrapup. Ever.
Andrew Savikas
February 17, 2009
Every single thing on Kat Meyer's Tiger-Beat-style cover from her TOC wrapup cracked me up:
I think we may have a new cover design for our printed program for TOC 2010. Well done, Kat.
Full Text of Jason Epstein's TOC 2009 Keynote
Andrew Savikas
February 17, 2009
Few can claim the depth of experience with publishing that Jason Epstein brought to the stage at the TOC Conference. Among my favorite moments of the conference this year was the chance during a break to hear Jason talk with Tim O'Reilly about their respective views on the past and future of publishing.
Several attendees asked for the full text of Jason's keynote, and he was kind enough to oblige:
Speech given by Jason Epstein at the 2009 O'Reilly Tools Of Change for Publishing Conference
I don't have to tell anyone here that we are at the end of the Gutenberg era; at the threshold not only of a new way of publishing books but of a cultural revolution orders of magnitude greater than Gutenberg's, assuming we survive our financial calamity, our 20,000 nuclear weapons, and our melting ice cap, all of them by the way unintended consequences of the western civilization that Gutenberg's technology made possible.
Five centuries ago Gutenberg's dream was to print a uniform prayer book on his new press to be distributed to all the churches of Europe and in this way unify the catholic faith which was fractured by schisms, especially in Germany where Gutenberg made his living selling trinkets at religious fairs. Instead, to what would have been Gutenberg's dismay had he lived to see it, the printing press spawned our modern world with all its wonders and woes -- the Protestant Reformation, the Enlightenment and for better and worse, our skeptical, secular, experimental civilization. Whoever believes that books are simply another form of entertainment has missed the point.
Are Ebook Device Makers Missing the Market?
Andrew Savikas
February 16, 2009
Over on Dear Author, Jane Litte suggests current ebook device marketers aren't effectively targeting what is likely the most influential segment of their market -- women:
The idea is to get women thinking that the vehicle fits into their lives, rather than the woman fitting her life around the vehicle. The most recent Kindle 2.0 ad shows a business man leaning up against the post reading a Kindle and a woman on the beach reading her Kindle, all alone. Seriously? What woman has frequent escapes to the beach where she is alone!
...
Ads need to show women reading on the bus, train, subway. Ads should show a woman leaning against a post waiting for a ride or in her SUV waiting to pick up the kids from practice or in the lunch line or grocery store line or waiting at the post office or in the doctor's waiting room. The point of the ads should be that the device is there whereever a woman is, whenever a woman wants it. It should not point out that the only time you can read an ebook is when you are alone and in the park.
Lot of great stuff -- the full post is well worth a read (and props to the Dear Author folks for a killer iPhone version of their blog).
Links to All Articles/Posts from Best of TOC eBook
Andrew Savikas
February 15, 2009
Some of you interested in the "Best of TOC" ebook have objected to having to go through the O'Reilly shopping cart process to get the free ebook. Point taken, and thank you for the feedback. Other readers are looking for a place to comment on the pieces; because these were all published blog posts, many already have rich comment threads of conversation. To address both concerns, here's a full linked list of all the pieces we included in the Best of TOC ebook:
- Digital Rights Management Versus Enforcement
- Amazon Ups the Ante on Platform Lock-In
- Ebook Format Primer
- Ergonomics and Ebook Success
- Responsibly Assuaging Author Concerns About File Sharing and “Piracy”
- It’s Time to Accept an Ambiguous Digital Fate
- Storytelling 2.0: Alternate Reality Games
- Content Owners and Consumers Need Digital Quid Pro Quo
- The Pitfalls of Publishing’s E-Reader Guessing Game
- Treating Ebooks Like Software
- On Publishers and Software Development
- Ebooks and Print Books Are Not Mutually Exclusive
- POD Opens Door to Magazine Experiments and Customization
- Web Community Management Tips
- Reinventing the Book and Killing It are Separate Things
- Q&A with Developer Who Turns Ebooks into iPhone Applications
- Terry Goodkind Follows The Money
- Web Analytics Primer for Publishers
- A Unified Field Theory of Publishing in the Networked Era
- How Many Publishing CEOs Know What an API Is?
- Why You Should Care About XML
- Publisher as Brand?
- Regulating the Google Settlement
- Point-Counterpoint: On Digital Book DRM
- Point-Counterpoint: Digital Book DRM, the Least Worst Solution
- Interstitial Publishing: A New Market from Wasted Time
- The Once and Future Ebook: On Reading in the Digital Age
According to our ecommerce data, several hundred of you have "purchased" the free ebook. I'm thrilled there's so much interest -- this is definitely something we'll be looking to do again with this and other conferences.
Text and XML of All #TOC 2009 Tweets
Andrew Savikas
February 13, 2009
I was planning to do some crunching last night and early today, but between an unexpected flight delay coming back from New York, and the pleasant surprise of getting Slashdotted about Bookworm, the day is quickly slipping away. I'll give it a go over the weekend, but if anyone else is eager to play, here's a super-raw text dump (the best I could do for getting around the API limit). Update: to be explicit, this covers roughly mid-afternoon Sunday 2/8 through late morning Thursday 2/12, so includes the entire event, but not every #toc tweet.
Update #2: Using the raw text as a starting point, I've generated an XML file listing all of the people who tweeted with hashtag #toc during the conference, and listed each of their tweets. I'll leave it as an exercise to the reader :) to sort by time, or otherwise slice/dice (best visualization among those submitted in the comments by 2/24 at midnight EST gets a free pass to TOC 2010 -- winner chosen by the TOC program committee, and announced 2/26).
Update #3: Unfortunately, the Twitter Search API appears to have returned only the first 15 or so of each user's #toc tweets (nowhere near enough to include all of the 200+ tweets from the top tweeter, @thewritermama), so that XML doesn't contain all of the tweets in the plain text. I've posted the intermediate XML I used, which contains less data about each tweet and tweeter, but does contain all of the tweets.
Update #4: For anyone interested in the gory details of where the XML came from, I've posted some background over at O'Reilly Labs.
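For anyone who wants a head start on the slicing and dicing, here is a minimal Python sketch of one way to sort tweets by time and count them per user. The XML layout shown (a `user` element with a `name` attribute wrapping `tweet` elements that carry a `time` attribute) is an assumption for illustration only, as are the sample tweets and the `load_tweets` helper. The actual file may be structured differently, so adjust the element and attribute names to match it.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime

# Hypothetical sample data; element and attribute names are assumptions,
# not the schema of the real #toc XML file.
SAMPLE = """\
<tweeters>
  <user name="alice">
    <tweet time="2009-02-10T09:15:00">Keynote starting now #toc</tweet>
    <tweet time="2009-02-09T14:02:00">Just arrived in New York #toc</tweet>
  </user>
  <user name="bob">
    <tweet time="2009-02-10T09:20:00">Great point about ebooks #toc</tweet>
  </user>
</tweeters>
"""


def load_tweets(xml_text):
    """Flatten the per-user XML into a list of (time, user, text) tuples."""
    root = ET.fromstring(xml_text)
    tweets = []
    for user in root.iter("user"):
        for tweet in user.iter("tweet"):
            when = datetime.fromisoformat(tweet.get("time"))
            tweets.append((when, user.get("name"), tweet.text))
    return tweets


tweets = load_tweets(SAMPLE)

# Chronological order across all users.
by_time = sorted(tweets, key=lambda t: t[0])

# Number of tweets per user, handy as input for a tag cloud or leaderboard.
per_user = Counter(user for _, user, _ in tweets)
```

From there, `per_user.most_common()` gives a ranked list of tweeters, and `by_time` can be bucketed by hour or day to chart activity over the course of the conference.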
Video: Android meets Eink
Andrew Savikas
February 13, 2009
Keeping with the "labs" theme for recent posts, via a tweet from George Walkley:
Lots of talk about devices at TOC - now just saw this, Android + e-ink https://vimeo.com/3162590 #toc
The guys at MOTO labs have hacked together a prototype showing Google's Android operating system running on an e-ink display:
Android Meets E Ink from MOTO Development Group on Vimeo.
The "O'Reilly Bump" and Bookworm
Andrew Savikas
February 12, 2009
During his TOC Keynote, Tim O'Reilly talked about how the status he confers through "retweets" on Twitter is really just another form of publishing, not much different from the status we confer on authors by publishing them, or speakers by featuring them (especially at multiple conferences), or hackers by inviting them to Foo Camp.
On the Web, the effects are easily measured, and Liza Daly has a post over at O'Reilly Labs talking about the bump Bookworm got from the association with O'Reilly. Her graph tells the main story, but digging deeper reveals some notable nuggets (emphasis in the original):
Because of this integration [with Stanza], iPhone and iPod Touch users account for 10-20% of all visitors to Bookworm on any given day
Tools of Change for Publishing is a division of O'Reilly Media, Inc.
© 2009, O'Reilly Media, Inc. | (707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.