Vanessa Fox

Vanessa Fox, called a “cyberspace visionary” by Seattle Business Monthly, is an expert in understanding customer acquisition from organic search. She shares her perspective on how this impacts marketing and user experience at ninebyblue.com and provides authoritative search-friendly design patterns for developers at janeandrobot.com. She’s also an entrepreneur-in-residence with Ignition Partners and Features Editor at Search Engine Land. She is co-chair for O’Reilly Found, a conference for web developers about SEO. Vanessa speaks regularly at conferences and corporate events and has written for a number of publications. She previously created Google’s Webmaster Central, which provides both tools and community to help website owners improve their sites to gain more customers from search. She was recently named one of Seattle’s 2008 top 25 innovators and entrepreneurs.
Wed, Apr 15, 2009
Practical Tips for Government Web Sites (And Everyone Else!) To Improve Their Findability in Search
by Vanessa Fox | comments: 18
In an earlier post, I said that a key to government opening its data to citizens, becoming more transparent, and improving the relationship between citizens and government in our web 2.0 world is ensuring that content on government sites can easily be found in search engines. Architecting sites to be search engine friendly, particularly sites with as much content and legacy code as those the government manages, can be a resource-intensive process that takes careful long-term planning. But two keys are:
- Assessing who the audience is and what they're searching for
- Ensuring the site architecture is easily crawlable
Crawlability Quick Wins
This post is about quick wins in crawlability. In many cases, ensuring crawlability also ensures accessibility (particularly access via screen readers). From this standpoint, many government web sites have an advantage over other sites since they already build in many accessibility features. Creating search-friendly sites also improves usability and user access from mobile devices and slow connections. So forget everything you may have heard about how you have to sacrifice user experience for SEO. SEO done right facilitates deeper audience engagement, makes it easier for visitors to navigate and find information on the site, and provides access to a wider variety of users.
Use XML Sitemaps
Create XML Sitemaps that list all the pages on the site and submit them to the major search engines.
Why is this important? Many government sites have poor information architecture. Ideally each page of the site should have at least one link to it. This helps users navigate the site and helps search engines find all of the pages. Long term, these sites should revamp their navigational structure so that at least one link exists to every page. Since that may take some time to implement, an XML Sitemap can function in the meantime to provide a list of all pages for search engines to crawl.
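The Sitemap itself is a small XML file following the sitemaps.org protocol. A minimal example might look like this (the URLs and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="https://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.gov/</loc>
    <lastmod>2009-04-15</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  <url>
    <loc>https://www.example.gov/reports/2009-budget.html</loc>
    <lastmod>2009-03-01</lastmod>
  </url>
</urlset>
```

Only the `<loc>` element is required for each URL; `<lastmod>` and `<changefreq>` are optional hints that help crawlers prioritize.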
Government sites have already made great progress in search by using XML Sitemaps.
The Energy Department's Office of Scientific and Technical Information (OSTI) implemented the XML Sitemaps protocol with great success. "The first day that Yahoo offered up our material for search, our traffic increased so much that we could not keep up with it," said Walt Warnick, OSTI's director.
If possible, provide an HTML sitemap as well, which gives site visitors a browsable view of the site's structure. nih.gov offers a good example of a browsable HTML sitemap.
Don't block access to content
Make all content available outside of a login, registration form, or other input mechanism. Search engine crawlers can't access content behind a login or registration. If the content requires the visitor to enter an email address or otherwise provide input before accessing it, it won't show up in search results.
tags: google, search, xml
Tue, Mar 24, 2009
Transforming the Relationship Between Citizens and Government: Making Content Findable Online
by Vanessa Fox | comments: 9
Thursday on this blog, Congressman Honda asked, "how can congress take advantage of web 2.0 technologies to transform the relationship between citizens and government?" He noted that "A dramatic shift in perspective is needed before that need can be met. Instead of databases becoming available as a result of Freedom Of Information Act requests, government officials should be required to justify why any public data should not be freely available to the taxpayers who paid for its creation." He asked for input on what web 2.0 features he should add to his website to take advantage of today's online world.
The most important feature government web sites can add isn't really a feature at all. But it would absolutely transform the relationship between citizens and government and make an amazing array of public data available. What's this magic feature?
Make government web sites search engine friendly.
How we look for information
Search is the primary navigation point for the web. Often when citizens look for government information, they start at a major search engine. They don't think to themselves, I need some information on vitamins, so I'll just go on over to the Office of Dietary Supplements at https://dietary-supplements.info.nih.gov. And then I need to make sure I'm eating a balanced diet, so I'll just check out https://www.nutrition.gov from the National Agricultural Library. And before I head to the grocery store, I'll make sure I understand how nutrition labels work from the information provided by the Center For Food Safety and Applied Nutrition at https://www.cfsan.fda.gov. Mostly, they go to Google and type in [food labels]. And in some cases, this works perfectly and the information appears.
But when information from government web sites doesn't show up on the first page of results for those searches, the information may as well not exist at all. For instance, an amazing amount of data exists from the U.S. Census Bureau, but it's inaccessible from search engines because it's locked behind JavaScript forms and the content itself doesn't use language that searchers would use. If I search for [98116 census data], results from census.gov are nowhere to be found.
Obstacles to being found in search engines
One problem is that the U.S. Census Bureau pages don't use zip codes to denote regions. They use tract numbers. Even if the pages were written in plain language searchers might use, search engine crawlers couldn't get past the JavaScript forms to access the pages.
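To make the problem concrete, here's a simplified, hypothetical sketch of the two patterns. A crawler can't fill out or submit a script-driven form, but it follows plain links just fine:

```html
<!-- Pages reachable only through a JavaScript-driven form are invisible
     to crawlers (the function and field names here are hypothetical): -->
<form onsubmit="showTractData(document.getElementById('tract').value); return false;">
  <input type="text" id="tract" />
  <input type="submit" value="Get data" />
</form>

<!-- The same data exposed through a plain, static link is crawlable
     (the URL is hypothetical): -->
<a href="/data/tract-97.02.html">Census Tract 97.02 data</a>
```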
Try doing a search using the same terminology as the U.S. Census Bureau, and you start to see the problem with the site's findability. Take [census tract 97.02]: none of the results leads to the Census Bureau's handy tract-level data.
In addition to being buried behind JavaScript and containing little language people would actually search for, it's hidden in a popup with a URL like this: https://factfinder.census.gov/servlet/IdentifyResultServlet?_mapX=281&_mapY=216&_latitude=&_longitude=&_pageX=442&_pageY=554&_dBy=100&_jsessionId=0001cv7n8rWxjslrmI9aRw5nr-V:134a7lbrs
The server appends a session ID to the end of the URL (the portion beginning with "_jsessionId"), which is tied to an individual visitor session and times out after 60 minutes. If I share that URL on a social media site, in email, or in this blog post, anyone who tries to visit it just gets a "session has expired" message. It goes without saying that this kind of URL can't be indexed by search engines, no matter how sophisticated they become.
tags: gov2.0, open government, seo, web 2.0
Thu, Jan 22, 2009
Making Site Architecture Search-Friendly: Lessons From whitehouse.gov
by Vanessa Fox | comments: 11
Guest blogger Vanessa Fox is co-chair of the new O'Reilly conference Found: Search Acquisition and Architecture. Find more from Vanessa at ninebyblue.com and janeandrobot.com. Vanessa is also entrepreneur in residence at Ignition Partners, and Features Editor at Search Engine Land.
Yesterday, as President-elect Obama became President Obama, we geeky types filled the web with chatter about change. The change of change.gov becoming whitehouse.gov, that is. The new whitehouse.gov robots.txt file opens everything up to search engines, while the previous one had 2,400 lines! The site has a blog! The fonts are Mac-friendly! That Obama administration sure is online savvy.
Or is it?
An amazing amount of customer acquisition can come from search (a 2007 JupiterResearch study found that 92% of online Americans search monthly and over half search daily). Whitehouse.gov likely doesn't need the kind of visibility that most sites need in search, but when people search for information about today's issues, such as the economy, the Obama administration surely wants the whitehouse.gov pages that explain its position to show up.
The site has a blog, which is awesome, but the title tag, the most important tag on the page, contains only the text "blog". Nothing else. That might help the page rank well for people searching for [blog], but that's probably not what they're going for. This doesn't just hurt them in search, of course; it's also what shows up in the browser tab and in bookmarks.
The site runs on IIS 6.0. Do the site's developers know about the tricky configuration needed to make redirects on that server search engine-friendly?
Search engines are text-based, so they can't read text hidden in images. Some whitehouse.gov pages get around this issue well by styling text to look image-like while leaving it as actual text.
However, other pages embed text in images and don't use ALT text to describe them. (This, of course, is an accessibility issue as well, since it keeps screen readers from reaching the text in the images.) An example is the home page, which may be part of why whitehouse.gov doesn't show up on the first page of results in a search for [President Obama].
There are all kinds of technical issues, big and small, that impact whether your site can be found in search results for what you want to be found for. (whitehouse.gov uses underscores rather than dashes in URLs, the meta descriptions are the same on every page...) Probably the biggest issue in this case is the lack of 301 redirects between the old site and the new site. When you change domains and move content to the new domain, you don't want to have to rebuild the audience and links all over again. (Not that Obama or whitehouse.gov will have a problem attracting an audience, but we can't all be president!) When you use a 301 redirect, both visitors and search engines know to replace the old page with the new one.
In the case of change.gov, it's unclear if they intend to maintain the old site. The home page asks people to join them at whitehouse.gov, but all the old pages still exist (even the old home page at https://change.gov/content/home).
And in many cases, the same content exists at both change.gov and whitehouse.gov (see, for instance, https://change.gov/agenda/iraq_agenda/ and https://www.whitehouse.gov/agenda/iraq/).
As Matt Cutts, Googler extraordinaire, pointed out, we should give them a few days to relax before worrying so much about SEO. And I certainly think the site is an excellent step towards better communication between the president and the American people. But not everyone has the luxury of one of the most well-known names and sites in the world, so the technical details matter more for the rest of us.
If you want to know more about technical issues that can keep your site from being found in search and tips for making sure that you don't lose visibility in a site move, join us for the O'Reilly Found conference June 9-11 in Burlingame. And if you're in Mountain View tomorrow night (Thursday, January 22nd), stop by Ooyala from 6pm to 9pm for our webdev/seo meetup, and get all your search questions answered. Hope to see you there! (Macon Phillips and the whitehouse.gov webmasters are welcome, but my guess is that they're a little busy.)
tags: publishing, search, seo, web 2.0, whitehouse.gov
O'Reilly Home | Privacy Policy ©2005-2009, O'Reilly Media, Inc. | (707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.