Steve Taylor photo

Double slashes in Analytics URLs


I’ve just been dealing with an issue on a site where Google Analytics is logging a lot of pages twice, once normally and once with a double slash—“//”—at the end.

Obviously this is worrying. If Google is seeing the same page in two “places” via two technically different URLs, duplicate content penalties and PageRank squandering are distinct possibilities. It also seems to break a lot of the Analytics “Site Overlay” functionality.

Here I’m going to go through what I’ve done to isolate the cause of the issue, and approaches to fixing it.

Ruling out the obvious

Naturally I combed through our sitemap.xml (generated by the Google XML Sitemaps WordPress plugin)—no double slashes in there.

Also, I did a site:domain.com type search in Google, and searched through all the returned URLs. No double slashes there either. This suggests that the worst possibility is thankfully not an issue—Google doesn’t appear to be indexing multiple versions of the same pages. It appears to be an Analytics-specific thing.

Possible cause #1: Bad incoming links

I don’t think this is the issue. I pinpointed the referring URLs for a few of the double-slashed entries in our Analytics, and the links are fine.

Possible cause #2: Bad .htaccess redirects

There are quite a few 301 redirects in our .htaccess file, because this site is a revamp where many URLs have changed. I was worried that some might be badly formed and redirecting with extra slashes. However, the test above pretty much rules this out, too: clicking the referring links went through fine. If a redirect were adding slashes to incoming links, the double slash would have shown up in the browser when I clicked through.

Probable causes: A single bad internal URL and an Analytics mystery

I used Xenu to check internal links, and found one instance of an internal link with a double slash at the end. It was the top-left link back to the front page on the WordPress login page of a sub-site, which runs a separate WP installation from the main site in the root.

This seemed to point the finger at the generally excellent qTranslate plugin—which is installed on this sub-site, but nowhere else. The “back to blog” links on other WP installations were fine.

As part of handling multiple language versions of the same WP content, qTranslate uses URL suffixes, such as /de/. This “back to blog” link is output with the WP bloginfo('url') function (followed by a hard-coded trailing slash). This function returns the blog URL entered via WP’s settings, which should be entered without a trailing slash. There was indeed no trailing slash in our setting, so it seems that qTranslate’s filtering of the bloginfo('url') function must be mistakenly adding a trailing slash where one isn’t expected.
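To make the mechanism concrete, here is a minimal sketch in plain PHP (no WordPress required). The function name join_home_link() and the example.com URLs are hypothetical, used only for illustration; inside WordPress itself, untrailingslashit() and trailingslashit() do a similar job.

```php
<?php
// Hypothetical stand-ins for what bloginfo( 'url' ) might return.
$expected = 'http://example.com/blog';   // setting entered without a trailing slash
$filtered = 'http://example.com/blog/';  // same value after a filter appends a slash

// What the template effectively does: output the URL, then a hard-coded slash.
$naive_ok  = $expected . '/';  // http://example.com/blog/
$naive_bad = $filtered . '/';  // http://example.com/blog//

// A defensive join: strip any trailing slashes first, then add exactly one.
function join_home_link( string $url ): string {
	return rtrim( $url, '/' ) . '/';
}

echo join_home_link( $expected ), "\n"; // http://example.com/blog/
echo join_home_link( $filtered ), "\n"; // http://example.com/blog/
```

Either input produces the same link, so the template no longer depends on whether a plugin has already added the slash.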

I’ve no idea how, but it seems that this single instance of a double-slash was being picked up by Analytics, and was proliferating through other logged data.

Solution #1: Fixing the trailing slash in WordPress

The first step was to remove the extra slash from the results of the bloginfo('url') function. Looking at the qTranslate forums, others seem to have noticed this issue. Hopefully it’ll be fixed soon, but until then, placing this code in your WordPress theme’s functions.php file should make sure this function returns the right URL:

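// Strip a single trailing slash from the URL returned by bloginfo( 'url' ).
function fixblogInfoURL( $result = '' ) {
	if ( substr( $result, -1 ) === '/' ) {
		$result = substr( $result, 0, -1 );
	}
	return $result;
}
// 'bloginfo_url' filters the URL-type values that bloginfo() outputs.
add_filter( 'bloginfo_url', 'fixblogInfoURL' );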

Solution #2: An Analytics filter

This Analytics support thread ends with a suggestion to include a search/replace filter in your Analytics profile. As all my testing seems to show this is a problem internal to the Analytics system, this seems like a good approach. I’ve only just set this up, so I’ll report back if any problems come up with this. Please let me know how it’s worked (or not) for you!
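As a rough illustration of what such a search/replace filter does to the logged page paths, here is a small PHP simulation. It is not Analytics configuration, and the pattern is my assumption rather than the one from the support thread; the $hits data and merge_double_slash_hits() are hypothetical.

```php
<?php
// Hypothetical pageview counts keyed by logged request URI.
$hits = [
	'/about/'   => 40,
	'/about//'  => 12,
	'/contact/' => 7,
];

// Simulate a search/replace filter: search for two or more slashes at the
// end of the URI, replace them with one, then add up the merged counts.
function merge_double_slash_hits( array $hits ): array {
	$merged = [];
	foreach ( $hits as $uri => $count ) {
		$clean = preg_replace( '#/{2,}$#', '/', $uri );
		$merged[ $clean ] = ( $merged[ $clean ] ?? 0 ) + $count;
	}
	return $merged;
}

print_r( merge_double_slash_hits( $hits ) );
```

The point is that the double-slashed entries stop appearing as separate pages and their hits are counted against the normal URL.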

Solution #3? Rewriting

Searching on this issue will bring up multiple suggestions for .htaccess rewrite code to replace double slashes with single slashes.

I can see the logic in this, but as my testing seems to indicate that this is a case of one or two bad URLs mysteriously spreading through the Analytics system (and not through Google’s index or anywhere else), it seems sufficient to isolate those bad URLs, fix them, and add an Analytics filter.

Again, I’ll report back if this rewrite approach ends up being necessary; and again, let me know your experiences if they differ.
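For setups without mod_rewrite, the same idea can be sketched in PHP. This is an illustrative assumption, not any of the .htaccess rules or plugins mentioned on this page: collapse_slashes() is a hypothetical helper that cleans only the path and leaves the query string alone.

```php
<?php
// Collapse runs of slashes in the path part of a request URI,
// leaving the query string (everything after the first "?") untouched.
function collapse_slashes( string $request_uri ): string {
	$parts = explode( '?', $request_uri, 2 );
	$path  = preg_replace( '#/{2,}#', '/', $parts[0] );
	return isset( $parts[1] ) ? $path . '?' . $parts[1] : $path;
}

// In a live web request only: redirect permanently when the cleaned URI differs.
if ( PHP_SAPI !== 'cli' && isset( $_SERVER['REQUEST_URI'] ) ) {
	$clean = collapse_slashes( $_SERVER['REQUEST_URI'] );
	if ( $clean !== $_SERVER['REQUEST_URI'] ) {
		header( 'Location: ' . $clean, true, 301 );
		exit;
	}
}

echo collapse_slashes( '/foo//bar?find=a//b' ), "\n"; // /foo/bar?find=a//b
```

Unlike a server-level rewrite, a PHP check like this only runs for requests that reach PHP, so images and other static files would not be covered.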


UPDATE: I’ve decided it’s best to include a .htaccess rewrite just in case. Thomas Scholz suggests one way below. I’ve actually ended up using this one at yoast.com.

Postscript: The Default Page setting

9/11/09: Thanks to Gavin Doolan’s comment, this problem has become a little clearer. On the Google Analytics profile in question, the “Default page” setting was set to “/” (one forward slash). It seems now this was wrong, and at least part of the problem.

Now, since I implemented the above fixes, our stats stopped registering double-slash URLs completely (even with the default page set to “/”). They’re still in the stats from before the fixes were applied, but hits dropped to zero after the fixes. Maybe the rewrite was saving Analytics from registering double-slash URLs. Maybe some factor besides the default page setting was at work (which certainly seems possible given the confusion I’ve seen on forums about this issue).

However, even with the double-slashes not registering in terms of hits, the Site Overlay feature just wasn’t working. It registered no hits on any links—and when you hovered over any link, there on its end was the dreaded double-slash. Now that the default page is left blank, Site Overlay is back to normal. I suspect and hope that this is the last nail in this issue’s coffin!

10 comments

  1. Steve Taylor

    Many thanks Thomas. I take it the .htaccess directives are deemed the better solution? As I mentioned above, I’ve not included this for now, but I might add it to be safe. Good to have a link to reliable code for this.

  2. Steve Taylor

    Thomas, I am slightly confused by the .htaccess code provided on your post. Maybe it’s explained in the German copy, but why one or more caps-only letters at the start? (i.e. ^[A-Z]+). And why the space character before the first forward slash?

  3. Yes, .htaccess is the better way. It catches images and other non PHP files too. And it’s faster.

    I wrote the plugin just for people without mod_rewrite.

  4. THE_REQUEST contains the full request line (method, request URI and query string), e.g.:
    GET /foo//bar?find=a//b

    So ^[A-Z]+\ matches the request method (POST, GET … DELETE ;)).

    And no, I haven’t explained this in my blog post. Maybe I should.

  5. Steve Taylor

    Gotcha, thanks. I guess I’m just used to REQUEST_URI and so on…

  6. Also double check your Google Analytics default page setting.

    If you use a / forward slash sometimes it can be represented in GA as // incorrectly.

    The default page field is for websites where say /index.php and / are the same. You would enter /index.php so it would be aggregated together with / and pageviews would be combined.

  7. Steve Taylor

    Gavin – many, many thanks. It looks like this is probably a large part of the problem. I’ve added an update to the post. Thanks again!

  8. Joe O'Connell

    Kudos and thank you! I recently encountered a similar problem that started “all of a sudden” in my analytics account. Sure enough, it started the day I edited my site profile’s default page setting (among many other things I modified that day). This post clued me to correlate, and now it is fixed.

  9. tom3k

    oh my!

    could it be? a solution to one of those problems i’ve been scratching my head over for about… oh a year now??!

    guess i wont know until 20 hours from now, but this looks promising!

    thanks regardless!

    mid comment update:

    CURSES!

    the one time that i figure this out, turns out ga is updating their system (as has been stated on the site for a week now)…

    We’re sorry.

    We are currently undergoing system maintenance.

    Please try again later – we’ll bring the system back up as soon as we can.

    LOL!

    but yea, this looks very promising :)

    thanks again!
