Over the last 2.5 years, we’ve identified a few problematic classes of questions that tend to get asked on our sites. Many of these are documented in our standard set of close reasons: exact duplicate, off-topic, subjective and argumentative, not a real question, and too localized.
However, as we launched the great Super User experiment, a new, previously unknown class of problematic questions emerged — the shopping recommendation.

That is, on Super User we began encountering questions like:
Macbook Air vs. Macbook Pro?
What’s the best dual-band wireless router?
Dell GX280 Processor upgrade?
What RAM should I buy?
Nvidia or ATI video card?
These questions may seem tolerable at first glance. Isn’t it our mandate to help our fellow users? But consider the voluminous amount of information you need to even begin properly answering a shopping question:
- What is your budget?
- Where do you live?
- What are your preferences?
- Which alternatives will you consider?
- When do you want to buy?
Let’s say the question asker provided all that information. Fat chance, I know, but let’s pretend for a moment they did — and we were able to provide the perfect, ideal shopping recommendation to them. Even if that was the case, technology moves so rapidly that the best shopping recommendations will be utterly obsolete within a year! What’s the point of a bunch of labor intensive questions that provide only temporary benefit to a limited (some might say Too Localized) audience? There isn’t any. That’s what we concluded, and we explicitly disallowed shopping questions in the Super User FAQ:
Super User is for computer enthusiasts and power users. If you have a question about …
- computer hardware
- computer software
and it is not about …
- videogames or consoles
- websites or web services like Facebook, Twitter, and WordPress
- electronic devices, media players, cell phones or smart phones, except insofar as they interface with your computer
- a shopping or buying recommendation
… then you’re in the right place to ask your question!
However, there is a way to ask these questions that avoids the inherent problems with shopping recommendations. For example, let’s say you wanted — as I did — to buy a point-and-shoot camera that takes good low light photos. So we’re going to ask on photo.stackexchange.com, naturally!
Here’s one way to ask:
Q: What’s the best low light point-and-shoot camera?
A: Canon S90 and Lumix LX3.
Here’s another way to ask:
Q: How do I tell which point-and-shoot cameras take good low light photos?
A: I strongly recommend looking for something with
- a fast lens (2.0 at least)
- reasonable ISO handling (at least 400, but preferably 800)
- the biggest sensor available
The sum of these factors are really critical for low light situations.
The former question provides the path of least resistance: a laundry list of products I can buy without thinking about it too much. But that answer will only be valid for a year at best. The latter question may take some thinking, but its answer will be valid forever … or at least until camera technology somehow shifts beyond lenses and sensors as we know them today. Thus, when it comes to shopping questions, don’t ask us what you should buy — ask us what you need to learn to tell what you should buy.
If I had to summarize our network in a single word, that word is “learning”. People come to our sites to learn about topics they are passionate about. As the old Chinese proverb goes, “Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime.” Every question and answer ultimately should be about teaching and learning — yes, even the shopping ones.
As Stack Overflow grows (or any other Q&A site in the Stack Exchange network, really), there’s a natural pressure to discover and link duplicate questions. The more questions you have, the higher the possibility a given new question isn’t in fact a new question, but a duplicate of an older existing question. Because of this, we’ve continually enhanced the tools for finding, linking, and merging duplicate questions.
One thing I want to be clear about, though, is that duplication is not necessarily bad. Quite the contrary — some duplication is desirable. There’s often benefit to having multiple subtle variants of a question around, as people tend to ask and search using completely different words, and the better our coverage, the better odds people can find the answer they’re looking for. And isn’t that, really, the whole point of this exercise?
Furthermore, it’s OK for duplicate questions to have duplicate answers. While you could argue that the duplicate questions could all be merged into one question with a “master” set of answers, this is kind of irritating from the perspective of the user looking for an answer. Put yourself in their shoes. Instead of finding …
Duplicate Question
—
Duplicate Answer
They have to deal with finding:
Duplicate Question
—
[closed as duplicate of Question] click here to see answers
Now, what other site requires users to do some sort of weird scroll-down, click-here-first to see the answer nonsense on the search results before they will reveal the answer? Oh yes, our old hyphenated pals. Do we really want our site to work like theirs?
Furthermore, I’ve found that the perfect duplicate question is a … bit of a mythical beast. There are similar questions, yes, and so-called “exact” duplicates do happen, but they are kind of rare in my experience. It’s far more common to have many subtle variations of a question. I think that’s OK, because that’s how the world works. Trying to shoehorn a bunch of semi-related things into one arbitrary container in service of some Highlander-ish “there can be only one” rule is ultimately harmful. Remember: while there are aspects of wiki to our system, we are not Wikipedia. There is not one canonical question about every possible subject. Rather, there are many.
In other words, over time, I have learned to stop worrying and love (some) duplication. And you should too.
Here are my official guidelines on question duplication:
- Having one “perfect” form of a question that contains every possible answer to every slight variation of that question is a myth at best and actively harmful at worst.
- Having dozens and dozens of variations of the same question is clearly bad.
- What we want is on the order of 4 or 5 similar-but-not-quite-the-same duplicates to cover all possible search terms and common permutations of the question. It is also OK for these duplicates to have their own answers so people who find them don’t have to click yet again to get to a good answer.
Let me be clear — too much question duplication is bad. Absolutely. You’ll get no argument whatsoever from me on that. But not enough question duplication is also bad. I know this does not sit well with programmers who love to think in binary black and white and cannot abide a single atom of duplicated content in the entire omniverse. But the honest, realistic answer to how much question duplication there should be is … “enough”. Question duplicates aren’t necessarily our enemy. They’re more like our, y’know, frenemies.
So, as always, use your good judgment and please continue to close and merge duplicates as you see fit. However, bear in mind that cultivating and supporting a moderate amount of natural duplication actively helps the community. I wasn’t kidding when I said learn to stop worrying and love (some) duplication. Use the above guidelines and try to find a happy, reasonable medium somewhere in the middle there.
The latest version of the Stack Exchange Creative Commons Data Dump is now available. This reflects all public data in …
- Stack Overflow
- Server Fault
- Super User
- Stack Apps
- all public non-beta Stack Exchange Sites
- all corresponding meta sites
… up to Nov 2010.
This month’s Stack Exchange data dump, as always, is hosted at ClearBits! You can subscribe via RSS to be notified every time a new dump is available.
Please read, this is not the usual yadda yadda! We changed the format of the data dump to include more requested fields, full revision history, and many other pending meta requests tagged [data-dump]. That’s why the dump is so much larger, but we did break it out into individual files per site within the torrent, so you can download just the files you need.

If you’d prefer not to download the torrent and would rather play with this month’s data dump in your web browser right now, check out our open source Stack Exchange Data Explorer. Please note that it may take a day or two for the SEDE to be updated with the latest monthly data dump.
Have fun remixing and reusing; all we ask is for proper attribution.
When I first started working at Stack Overflow, I wondered why the candidate’s work experience is referred to as a CV on Stack Overflow Careers. I honestly thought Stack Overflow might be a European company or maybe they were just being snobs. A resume is your work experience written up on a piece of paper – job sites, employers, recruiters, and everyone else, it seems, uses “resume”.
So, what gives with Stack Overflow Careers? Why CV and not resume? I learned pretty quickly that a CV encompasses your accomplishments in a more detailed format than a traditional resume. In fact, Curriculum Vitae roughly translates as “course of my life”. It’s true that CVs are used widely in academia and the medical field as a way to list accomplishments and credentials that go beyond a specific job role. A CV is updated anytime you have something meaningful to add – maybe it’s the sales from that software you designed or a new qualification or something else awesome that you did. Conversely, a resume is a document that you scrape together when you’re desperately looking for a new job.
A CV is about more than just your job experience, and chances are most developers don’t just program at work – they likely have a blog, a website, a side project, and other professional passions too. Many developers create viable products while in college or high school. This is terrific experience to show on your CV, painting a more accurate picture of your programming expertise.
While it would be easier for Stack Overflow Careers to use “resume” like everyone else, we think your programming experience is more valuable than a one-page list of past jobs. Plus, really, we’re saving bytes by the bucketload.
PS – If you are looking for something new for the next course in your life, you might want to keep in mind Stack Overflow is looking to hire more great developers!
As I mentioned in The Horror of No Answer: Revival and Necromancer:
It’s fine — expected, even — for there to be a “long tail” of questions that are too obscure, too narrow, or just plain unanswerable for whatever reason. Sometimes you have to be patient; it takes the time it takes. But seeing the number of zero-answer questions grow by 50% over a 3 month period is definitely concerning.
Part of this is our fault for not adapting the homepage to the massive amount of question activity that Stack Overflow now enjoys. We’re working on it, but it will take some time to figure out the right approach.
The default question ordering on the home page is a simple, flat list of the most recent (n) questions sorted by activity date — where activity is defined as a new answer, an edit, or a new question. Sophisticated, it ain’t, but it has worked well for us up to a certain volume of activity. Stack Overflow is now well beyond that volume.
I asked for help redesigning the Stack Overflow homepage on meta, and the consensus was to keep the same design (for now), but try to show more relevant questions to each user.
We began playing with experimental question weighting algorithms to decide which questions to show to a particular user. Sam Saffron set up a clever little experimental home page where you can have a play with the algorithm client side and see what weightings produce the best fit for you.
As of today, we’ve rolled this change out based on your feedback. On Stack Overflow (and only Stack Overflow) the default home page tab has changed from active to interesting. The goal is no longer to show you a simple flat list of the last (n) active questions — that’s not even possible any more based on sheer question volume — but, instead, to narrow the list to a subset of active questions that we think you will be interested in.
Here’s how it works. Starting with a list of the last 3,000 active questions:
- drop questions containing any of your ignored tags
- drop closed questions if you lack the reputation required to vote for reopening
- drop questions scoring -4 or lower
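The three drop rules above can be sketched as a simple filter. This is only an illustration, not the production code: the `Question` fields, the `candidate_questions` name, and the `can_vote_to_reopen` flag are all hypothetical, and `recent` is assumed to already hold the last 3,000 active questions.

```python
from dataclasses import dataclass


@dataclass
class Question:
    # Hypothetical record; the real question model is not described in the post.
    tags: set[str]
    score: int = 0
    closed: bool = False


def candidate_questions(recent, ignored_tags, can_vote_to_reopen):
    """Apply the three drop rules to a list of recently active questions."""
    kept = []
    for q in recent:
        if q.tags & ignored_tags:
            continue  # contains at least one of the user's ignored tags
        if q.closed and not can_vote_to_reopen:
            continue  # closed, and the user lacks the reputation to vote to reopen
        if q.score <= -4:
            continue  # scoring -4 or lower
        kept.append(q)
    return kept
```

Whatever survives this filter is what the scoring step would then rank.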
Next, apply the following score formula to the remaining questions:
| factor | score adjustment |
| --- | --- |
| your interesting tags | +1,500 per interesting tag, up to +2,000 total |
| your top 40 scoring tags | maximum of +1,000 per tag (scaled), up to +2,000 total |
| question score | +200 × score, up to +1,000 total |
| total answer score | -200 × score, up to -1,000 total |
| number of answers | -200 × answers, up to -1,000 total |
| number of views | -15 × views, up to -1,000 total |
| question last activity date | -1 × (seconds / 15) |
Count it all up and take the top 90 by score.
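To make the arithmetic concrete, here is a rough sketch of the weighting and the top-90 cut. It reads the formula literally and fills the gaps with guesses, so treat it as an illustration rather than the real implementation. The `Question` fields and function names are hypothetical, "seconds" is assumed to mean seconds since the last activity, and the "(scaled)" top-tag bonus is assumed to be proportional to your score in that tag relative to your best tag, which the post does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Question:
    # Hypothetical record holding only the inputs the formula needs.
    tags: set[str]
    score: int = 0
    total_answer_score: int = 0
    answers: int = 0
    views: int = 0
    seconds_since_activity: float = 0.0


def interest_score(q, interesting_tags, top_tag_scores):
    """Score one question for one user, following the weighting factors."""
    # +1,500 per matching interesting tag, capped at +2,000.
    interesting = min(1500 * len(q.tags & interesting_tags), 2000)

    # ASSUMPTION: "scaled" means relative to the user's best tag score.
    # top_tag_scores maps each of the user's top tags to their score in it.
    best = max(top_tag_scores.values(), default=0)
    top = 0.0
    if best > 0:
        top = sum(1000 * top_tag_scores[t] / best
                  for t in q.tags if t in top_tag_scores)
    top = min(top, 2000)

    question = min(200 * q.score, 1000)                      # up to +1,000
    answer_score = max(-200 * q.total_answer_score, -1000)   # down to -1,000
    answer_count = max(-200 * q.answers, -1000)              # down to -1,000
    views = max(-15 * q.views, -1000)                        # down to -1,000
    recency = -q.seconds_since_activity / 15                 # older is worse

    return (interesting + top + question + answer_score
            + answer_count + views + recency)


def top_interesting(questions, interesting_tags, top_tag_scores, limit=90):
    """Rank questions by interest score and keep the best `limit`."""
    ranked = sorted(
        questions,
        key=lambda q: interest_score(q, interesting_tags, top_tag_scores),
        reverse=True,
    )
    return ranked[:limit]
```

Note how quickly the penalties saturate: five answers, or about 67 views, is already enough to hit the -1,000 floor for that factor, which is why well-trafficked questions drop off the page in favor of unanswered ones.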
We also mix in a few random questions from the last 3,000 — 10% (9) for logged in users and 20% (18) for anonymous users. We’re like DJs trying to spin a mix of songs — some you might know by heart and love, others you might not have chosen for yourself, but could possibly like if you gave them a fair listen.
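The random mix could look something like the sketch below. Two things here are assumptions, since the post does not say how the mixing is done: that the random picks take the place of the lowest-ranked slots so the page stays at 90 questions, and that they are simply appended at the end. The `mix_in_random` name and its parameters are hypothetical, and questions are represented by plain IDs.

```python
import random


def mix_in_random(ranked, pool, logged_in, page_size=90, rng=random):
    """Give part of the page to random questions drawn from the wider pool.

    ranked: question IDs already sorted by interest score, best first
    pool:   IDs of the recent active questions to draw random picks from
    """
    share = 0.10 if logged_in else 0.20
    n_random = round(page_size * share)  # 9 or 18 for a page of 90

    # ASSUMPTION: random picks replace the lowest-ranked slots.
    keep = ranked[: page_size - n_random]
    kept_ids = set(keep)
    leftovers = [q for q in pool if q not in kept_ids]
    picks = rng.sample(leftovers, min(n_random, len(leftovers)))
    return keep + picks
```

Passing a seeded `random.Random` instance as `rng` makes the picks repeatable, which is handy when comparing weightings.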
The resulting change in the homepage is fairly dramatic. Here’s a screenshot of the old Stack Overflow homepage (the active tab) compared to the new Stack Overflow homepage (the interesting tab):
Quite the sea of red unanswered questions, which seems to meet our goal of giving questions that haven’t yet gotten a good answer more time on the homepage to get one.
You can compare for yourself by viewing the old “active” tab at https://stackoverflow.com/?tab=active and comparing that to what you get shown, both as a logged-in user and as an anonymous user.
I’ll be honest with you, this change makes me nervous. It’s like Colonel Sanders mucking around with his magical blend of 11 herbs and spices. But at the same time, the old simple “questions by activity date” homepage default was clearly not working with the 2,000+ questions being asked on Stack Overflow each and every day. Something had to change.
Well, this is that change. Let us know what you think, and feel free to experiment with alternative weightings if you have ideas for ways to further improve upon it.
Blog – Stack Overflow
a programming community exploit
Recently
- Q&A is Hard, Let’s Go Shopping!
- Dr. Strangedupe: Or, How I Learned to Stop Worrying And Love Duplication
- Creative Commons Data Dump Nov 10
- What’s a CV Anyway?
- Stack Overflow Homepage Changes
podcasts are licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.




