Morgret Designs


Google Webmaster Tool and Data Refresh Questions

December 24th, 2006

Tonight, while looking in the webmaster tool, I’ve seen several sites reported with “no pages from your site are currently included in Google’s index,” yet the site: command shows the expected number of pages in Google’s index. Any insight, anyone?
Data refresh or coincidence? About six weeks ago, I took a site that had been static for a year, with javascript navigation and only one page in the index, and put in html navigation and submitted a sitemap. For four weeks, nothing happened. Still an old crawl date, still only two or three pages in the index. Last week 15 pages were in the index, but 14 of those were “very similar to the 1 already displayed”, and there was a crawl date of December 4th in the tool. At that time, I rewrote all of the title tags (putting the section name first, company name second), and wrote unique meta descriptions. This week 17 pages are in the index, and 14 of them are displayed. Yet December 4th is still the last date the Googlebot is said to have crawled the homepage.

Disclaimers: I am doing this to help a friend, so I wasn’t the one who set it up with the JavaScript and static content. I have only partial access to the site, not enough to see whether the bot has crawled more recently than the webmaster tool states.

Happy Holidays, Merry Christmas, and Happy New Year

December 22nd, 2006

It’s that time of year again. Time to have the family around, have a good dinner, open presents, then pack up the scuba gear and run off to someplace warm for a few days. I hope everyone has a good holiday season, and I look forward to writing again next year.

AboutUs.org: Too Much Information?

December 20th, 2006

One tool was not mentioned when people were trying to determine how Matt Cutts found a plethora of domain names registered to the same person. While it turns out nearly all of the sites Matt listed were actually on the homepage of the parent organization, one could have gone to AboutUs.org and found many of those related sites.
AboutUs.org “is a wiki whose goal is to create a free and valuable Internet resource containing information both about websites and other related data. The site was pre-populated with information about many different websites and thousands of updates are now being made by people each day” (from their about page). What their statement doesn’t mention is that much of the pre-populated information comes from Whois data, and many of those updates are people going to the site to remove their Whois information from the AboutUs website.

If you enter a website name that is not in the database, the AboutUs bot will visit that site and instantly create an entry. You can put in a robots.txt entry to disallow the User-agent: AboutUsBot, and visitors to the page will get a notice that the owner has elected not to initialize their AboutUs page with content. Visitors familiar with the site are then invited to edit and contribute to the page.
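For anyone who wants to double-check a robots.txt entry like the one described above, here is a minimal sketch using Python's standard-library robots.txt parser. The user-agent name AboutUsBot is the one mentioned above; the "Disallow: /" rule (blocking the whole site) and the example.com URL are my assumptions for illustration, and this only shows how a well-behaved parser reads the rule, not how the AboutUs bot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block AboutUsBot from the entire site.
ROBOTS_TXT = """\
User-agent: AboutUsBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain, not a real site from this post.
aboutus_allowed = is_allowed(ROBOTS_TXT, "AboutUsBot", "http://www.example.com/")
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/")
print("AboutUsBot allowed:", aboutus_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

The second check is there to confirm that the entry only names the one bot and leaves other crawlers alone.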

If you enter a website name that is in their database, you get the title, excerpt from the page, address and map of the domain registrant, contact information, and possibly related domains. In the case of the Allied Schools domains that Matt found, many of them were listed in the Related Domains section. You can edit and remove contact information — but it is easy to see prior versions of the website. This page has a summary of their responses to concerns of users regarding all of the information that is displayed.

My first reaction to this site was as negative as the posts on other blogs and the reactions people had when they thought Google had secret ways of viewing domain ownership. Having thought about it a bit, I find it interesting that this one particular aggregation of data has everyone upset, yet people thank Rand for posting A List of Every Website Statistic Publicly Available that includes DomainTools, which is registered to the same address as AboutUs. Even so, the AboutUsBot is now in my robots.txt file.

Search Engine Conspiracy Theories

December 19th, 2006

I got a good laugh from The Daily WTF today. Seems a guy came in late to work at a banner advertising firm and found everyone crammed into the president’s office. The president had done a search on “war banners” and kept landing back on Google’s homepage, never even getting to a page that said there were no results. The president and the other employees had spent half an hour discovering that there were no results for ad, media, advertisement, advert, etc. He was convinced Google was out to get him and put small advertisers out of business. The guy who had overslept found the president in a very upset state, but took a closer look at what was happening. The president was using Opera, and had blocked any URL that contained ads, advert, banner, etc. Opera will not load a page whose URL contains one of the blocked strings, hence all of the reloading of Google’s main page.

This was amusing, in comparison to what Stacy Williams shared at the Search Ad Buyers Forum at the Chicago SES. Yahoo changed the copy of her ads, did not notify her, and she only found out when her client discovered the completely changed copy. She’s also had keywords listed as active that were receiving no impressions — turned out they had tripped a filter, but she was never notified. She’s had a “technical glitch with the approval process” hang up 60 keywords for at least six weeks.

Stacy also shared her experiences with broad matches gone bad: a bid for “refurbished as/400” (IBM server) showing for “rebuilt Calcutta 400” (fishing reel) and “used Sun” showing for “used heat pump”. She found out about some of these by seeing lots of fishy keywords in the server logs. She lists several things you can do to prevent your ads from being changed, and ways to monitor for changes you may not have anticipated.

Blog Tag: Five Things You Don’t Know about Keri Morgret

December 16th, 2006

I haven’t played tag in a long time, but Blog Tag is going around and Bill Slawski tagged me to list five things people don’t know about me.

1. My last name is pronounced More Gray, though I hear distant relatives (of my husband) on the East Coast pronounce it More Gret. It works great as an early warning sign that a telemarketer is calling.

2. I wanted to be an occupational therapist since I was a child — until I got Cs in Anatomy and Physiology and wasn’t able to get into the OT program at my schools of choice. Assistive technology, sensory integration, and the needs of the developmentally disabled are still of interest to me.

3. Being a preschool teacher and dealing with two-year-olds was good preparation for working telephone technical support and dealing with adults throwing tantrums because their computers were broken.

4. I have been a ham radio operator (N6TME) since I was 11 or 12 years old, and every member of my family is still licensed and active in the hobby.

5. My grades in high school algebra would have been higher had I not read Ayn Rand’s Atlas Shrugged through most of the lectures my sophomore year.

My tags are: Dan Zarrella, Roger Johansson, Pamela Heywood, Brian Clark, and Andrew Goodman.

Dan Russell: How Do People Use Search Engines

December 14th, 2006

Dan Russell spoke at BayCHI Tuesday evening regarding how people use search engines. His talk was similar to one he gave to Stanford this March (video archived here), but I did find a couple of things of note.

The March speech only had slides of individual eyetracking sessions, while Tuesday’s speech showed the traditional “golden triangle” heat map of people spending the most time looking at the top left of the screen, but that held only for the first time they viewed the screen. On subsequent visits to the same page, there was considerably less looking in this same area, and the heat map was much more distributed among the results. He showed the second heat map, but did not give any more insight into this information.

With regards to individual eyetracking sessions, Russell mentioned how quickly people scanned, and that they often refined their query within 2-3 seconds of their initial result. He emphasized the importance of making sure your titles are written well so that your site won’t be overlooked in the 2-3 seconds before the user decides the results are not applicable. To view the individual eyetracking studies, see the link to Stanford’s HCI seminar above and go to about 42 minutes into the presentation.

Things I’ve wondered about regarding search and viewing SERPs that were not discussed: is tabbed browsing affecting search behavior? I will often open three or four likely results in new tabs before I even visit the first result. Has there ever been an experiment to set the default results at 20 instead of 10? Would that change behavior, or would so many results be below the fold that it would make no difference? I personally have my default set to 50 results, and sometimes still go to the second page.

As Russell also said to the audience, “You are a couple sigma away from the norm (and the fact that you understood that sentence proves my point) and so are your friends.” He also said “You are statistically insignificant” when you look at the vast number of people using Google. I do wonder how trends have changed with the advent of new browsers (especially with the tabbed browsing now in IE7 that is being forced upon the masses) and of bigger monitors with larger resolutions giving us a larger area “above the fold”.

In the Q&A session, Russell was asked if Google had any data about how people viewed the organic listings versus the sponsored listings on the top versus the sponsored listings on the right. As some in the audience got hopeful, he answered “Yeah, we have data on that. Next.”

WayMarkr: Continuous Cameraphone Documentation

December 8th, 2006

Mike Bukhin and Michael DelGaudio spoke at MUM2006 about WayMarkr. WayMarkr lets the user wear a cameraphone (must be a series 60 phone) that automatically takes pictures, adds the GPS location, and uploads them to the WayMarkr server. If the user sets the display to public, anyone can come in and look at a slideshow of the photos. On the second day of the conference, we took a bus tour from Stanford to San Francisco. Here’s the set of photos from that trip, showing where the bus was on the map as each picture was taken.

As a kid, I always thought it would be interesting to take a picture of the same spot in the backyard each day, to see the changes of the days and of the seasons, but never did follow through. The authors mention reviewing photo sets from their daily routine and being able to watch buildings being built, seasons changing, and a variety of things they hadn’t consciously noticed in the past — including their bad posture, and being hunched over a computer too much of the time. There are a variety of implications of this type of media, but I won’t dive into them now. I’m busy setting up WordPress on my own domain, and hope to have my blog transferred shortly.

The Lost Google Tapes: Early 2000 Interviews with Brin and Page

December 6th, 2006

Something interesting to think about as everyone recovers from attending (or reading the live blogging of) the Chicago SES.

John Ince, founder and chief podcaster of PodVentureZone.com, wrote in the December 3rd issue of the SF Chronicle about interviews he conducted with Page and Brin in early 2000. The hard drives containing the original interview were destroyed, but the author recently found the tapes of the interviews at home, and has converted them into a 10-part podcast series. Excerpts of the interviews were published in the Chron this Sunday, and here are some of the more amusing snippets.

“Ince: Do you see yourself as a competitor of Yahoo?
Brin: No.”

<snip>

“Ince: Will you be anything besides a search engine?
Brin: No, with the exception of the kinds of things I mentioned, like navigation aids and things like that. Although we’re open to develop across different axes of search engines or portals.”

Regarding the transition from Ph.D. candidate to entrepreneur, Brin mentioned learning to deal with the organizational challenges of having over 70 people with the company, keeping everyone productive and focused and the like.

On getting angel funding from Andy Bechtolsheim: “…we were showing (a demo on a laptop) to him and he said ‘Oh we could go on talking, but why don’t I just write you a check?’”

Splog and Scraped Site Detection

December 4th, 2006

In an earlier post, I mentioned how my mother (Esther) had an encounter with euphorbia sap that resulted in her visiting the emergency room, and that there was an article about her experience written by the Associated Press. I configured a Google Alert to tell me whenever esther euphorbia appeared. About half of the results have been legitimate newspapers; the others have been splogs/scraper sites with just the first paragraph of the article included (which I then report as spam in the Google Webmaster Console).

I went to the root directory of one splog site, and there was no index.html configured, just a list of several dozen folders, each of which had a different splog. One folder, however, was rsstoblog, an install directory for a splog generation program. Searching the name, one finds the features of this particular software, as well as a host of other similar software programs.

Features of rsstoblog include (per the website):

  • Eliminate footprints that the search engines can use to track you down. Random Posting Times - Make posts at varying times to your blog.
  • Filter RSS feeds for keywords
  • Our software encourages natural linking by pulling the RSS feeds and content for your blogs from many different sources!
  • No writing needed! With this possibility RSS offers, bloggers today do not have to necessarily write a single word of content yet fill their blogs with fresh, unique content every single day!

That explains why I was seeing the article display on splogs for “sap” “flower arranging” “flower delivery” and “nose job report” — the text in the first paragraph of the AP article included sap, flower, and nose, and these splogs were set to find posts including those keywords.

Based on this, I would suggest setting up some Google Alerts with two or three odd, unique words or phrases from the first paragraph of some of your blog posts. This should be an easy way to scan for both splogs and other issues of duplicate content.
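If you would rather not pick the words by hand, the idea above could be roughed out in a few lines of Python. This is only a sketch under my own assumptions: it treats "longest words not in a small stop list" as a crude stand-in for "odd, unique", and the function names, the stop list, and the sample sentence are all made up for illustration (the sample is not the actual AP text). You would still paste the resulting words into Google Alerts yourself.

```python
import re

# Tiny illustrative stop list (assumption); a real one would be far longer.
COMMON_WORDS = {
    "the", "and", "that", "with", "from", "her", "was", "for",
    "after", "were", "this", "have", "about", "into", "their",
}


def alert_terms(first_paragraph: str, count: int = 3) -> list[str]:
    """Return up to `count` of the longest uncommon words in the paragraph."""
    words = re.findall(r"[a-z]+", first_paragraph.lower())
    candidates = {w for w in words if w not in COMMON_WORDS and len(w) > 3}
    # Longest first, then alphabetical, so the result is deterministic.
    return sorted(candidates, key=lambda w: (-len(w), w))[:count]


def alert_query(terms: list[str]) -> str:
    """Join the chosen words into a single search string for an alert."""
    return " ".join(terms)


# Hypothetical opening sentence, invented for this example.
SAMPLE = "Esther was trimming a euphorbia when the sap reached her eye."
terms = alert_terms(SAMPLE)
print(alert_query(terms))
```

Word length is a weak proxy for rarity, so look over the suggestions before using them; a proper name or an unusual plant name will usually beat whatever this picks.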

Light Posting This Week

December 3rd, 2006

And it is not because of SES. I’ll be attending the International Conference on Mobile and Ubiquitous Multimedia at Stanford the first part of this week. When I was in graduate school, the focus of some of my courses was on user interface design, social learning, and using technology for learning. I’m looking forward to seeing the latest developments regarding these topics as they relate to mobile computing.

Copyright 2007 Keri Morgret