Fun with honeypots

I’ve been getting more interested in honeypots recently. This past spring, I set up a honeypot to learn more about what folks do once they successfully brute-force their way into an SSH server. The concept was simple: set up a Linux VM with common username/password pairs (e.g. mysql/mysql, user/user, admin/admin) and wait to see what happens.

I created an isolated bridge network on my Linux server, then set up a CentOS VM inside KVM. I used iptables to rate-limit outbound connections to only 2/minute, to prevent anyone who logged into the honeypot from using my VM to do much damage to anyone else on the internet. I also used the iptables NFLOG target to save a copy of all packets to/from the VM so that I could analyze the traffic later.
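The rules looked roughly like the sketch below. This is a reconstruction rather than my exact ruleset; the bridge name br-honeypot and the NFLOG group number are placeholders:

```shell
# Mirror every packet to/from the honeypot to NFLOG group 5; the copy can
# be captured on the host with something like: tcpdump -i nflog:5 -w vm.pcap
iptables -A FORWARD -i br-honeypot -j NFLOG --nflog-group 5
iptables -A FORWARD -o br-honeypot -j NFLOG --nflog-group 5

# Allow at most 2 new outbound connections per minute and drop the rest,
# limiting how much damage the VM can do to the rest of the internet
iptables -A FORWARD -i br-honeypot -m state --state NEW \
    -m limit --limit 2/minute --limit-burst 2 -j ACCEPT
iptables -A FORWARD -i br-honeypot -m state --state NEW -j DROP
```

Note the NFLOG rules come first: NFLOG is a non-terminating target, so the packet continues on to the ACCEPT/DROP rules after being copied.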

I needed some way to monitor what happened within the VM without tipping off whoever logged in that they were being monitored. So I turned on system call auditing as well as TTY auditing. Normally these log messages are sent to /var/log via syslog, which would make it obvious that everything was being logged and might prompt an intruder to cover their tracks. To keep the logs out of the VM, I modified the syslog configuration so the audit messages were never written to log files inside the VM, and instead redirected them to a virtual serial port connected to a log file on the host. This let me monitor every system call made by software they installed, as well as anything they typed in their SSH session. I wrote some Python scripts to filter through the data and pull out just the details I was interested in.
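The filtering scripts worked on records along these lines. A minimal sketch, assuming Linux audit-style type=TTY lines with hex-encoded keystrokes in the data= field (the sample records below are fabricated for illustration):

```python
import re

def decode_tty_records(lines):
    """Pull the hex-encoded data= field out of type=TTY audit records
    and decode it back into the keystrokes the attacker typed."""
    keystrokes = []
    for line in lines:
        match = re.search(r'type=TTY .*\bdata=([0-9A-Fa-f]+)', line)
        if match:
            keystrokes.append(bytes.fromhex(match.group(1)).decode('ascii', 'replace'))
    return ''.join(keystrokes)

sample = [
    'type=TTY msg=audit(1380000000.123:456): tty pid=1234 comm="bash" data=77686F616D690D',
    'type=SYSCALL msg=audit(1380000000.124:457): arch=c000003e syscall=59',  # ignored
]
print(decode_tty_records(sample))  # the hex decodes to "whoami" plus a carriage return
```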

In addition to all the monitoring within the VM, I also setup sslsplit and a fake certificate authority to capture any HTTPS traffic that left the VM. All TCP 443 traffic exiting the VM was sent to sslsplit, which performed SSL man-in-the-middle to decode the traffic. The fake certificate authority was added to the trusted CAs within the VM, so there wouldn’t be any security warnings.
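The setup amounted to three pieces: a throwaway CA, a NAT redirect, and the sslsplit listener. A sketch only; the bridge name br-honeypot, the ports, paths, and CA subject are all illustrative:

```shell
# Create the fake certificate authority (its cert is what gets installed
# into the VM's trusted CA store so no security warnings appear)
openssl genrsa -out ca.key 2048
openssl req -new -x509 -days 365 -key ca.key -out ca.crt -subj "/CN=Fake CA"

# Redirect all TCP 443 leaving the honeypot into sslsplit's listener
iptables -t nat -A PREROUTING -i br-honeypot -p tcp --dport 443 \
    -j REDIRECT --to-ports 8443

# Man-in-the-middle the traffic; -S logs the decrypted content to a directory
sslsplit -d -l connections.log -S /var/log/sslsplit \
    -k ca.key -c ca.crt ssl 0.0.0.0 8443
```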

With most of the setup part of the project complete, I enabled the NAT rules to forward SSH traffic over to the VM. Within 4 hours I had my first login. Over the course of the next week, every user account I had set up was brute-forced.

It was interesting to see what the folks who logged into my VM were up to, but it wasn’t too surprising. They were mainly interested in using the SOCKS proxy built into SSH to browse the internet. They ignored the SSL warnings in their browsers and continued on to SSL-protected websites. One installed an IRC bot. One installed an SSH brute-force tool and scanned for more victims. Another ran local privilege escalation exploits that did not apply to the version of CentOS my VM was running.

When I get some more time, I’d like to work more on the network plumbing for the honeypot VM. Currently, I only run this from my home IP address and am limited to a single VM. I’d like to be able to cycle IP addresses more frequently, so my plan is to purchase a few cheap Linux VPS systems and add a few IP addresses to each. I wouldn’t run any of the honeypot software on the VPS; instead I’d install OpenVPN and forward all traffic for the secondary IPs back to a central honeypot router/firewall running on my home network. From the honeypot router, I’d use NAT to forward the traffic to and from the individual VMs, making the OpenVPN connection and honeypot router transparent to anyone interacting with the honeypot.

Honeypot router concept

Once that is all set up, I’d like to test Tor exit nodes to see which operators are inspecting traffic. My plan is to set up several IP addresses and VMs, then log in to each of them as root, over telnet or FTP, from specific Tor exit nodes and see what happens.

Share
Categories: Uncategorized Tags:

Solar monitor with jqPlot and TED5000

I have a solar-powered home. Well, sort of. It is a grid-tied solar system, meaning when it is sunny outside our solar panels produce more power than we use and we bank that power with our utility company. At night, we draw from the power we banked during the day.

I have a TED5000 energy monitor to keep tabs on how much power we are producing and how much we are consuming. The TED5000 is great, but its interface is a bit lacking for me. It has all the information I need, but it is buried across several different screens. Answers to questions like how much solar power did we generate today, what was our net usage, or did we ever produce more than we used during a day are not immediately visible in the interface.

I had been wanting to teach myself jQuery, so when I found jqPlot it seemed like the right time to dive in and make a new interface. The result: a real-time view into our daily energy production and consumption.

Daily energy graph

The time of day is represented on the X-axis and is updated every 5 minutes with the latest data from the TED5000. The left Y-axis shows the average power during the 5 minute period and is related to the green and red lines on the graph. The right Y-axis shows the total energy produced and is related to the area curves on the graph. Green represents solar production and red represents energy consumption. Finally, the number in the upper-left of the graph is updated every 5 seconds and displays the current net power.
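The jqPlot options for that layout look roughly like this. The element ID, variable names, and data arrays are placeholders for whatever your TED5000 polling code produces; only the dual-axis arrangement is the point (the DateAxisRenderer also requires jqPlot’s dateAxisRenderer plugin script to be loaded):

```javascript
// production/consumption: [[timestamp, kW], ...] line series (left axis)
// prodTotal/consTotal:    [[timestamp, kWh], ...] area series (right axis)
var options = {
  axes: {
    xaxis:  { renderer: $.jqplot.DateAxisRenderer },  // time of day
    yaxis:  { label: 'Average Power (kW)' },          // green/red lines
    y2axis: { label: 'Energy (kWh)' }                 // shaded area curves
  },
  series: [
    { color: 'green', yaxis: 'yaxis' },               // solar production
    { color: 'red',   yaxis: 'yaxis' },               // consumption
    { color: 'green', yaxis: 'y2axis', fill: true },  // total produced
    { color: 'red',   yaxis: 'y2axis', fill: true }   // total consumed
  ]
};
$.jqplot('solarChart', [production, consumption, prodTotal, consTotal], options);
```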

I still use the TED5000 for its monthly estimates, but the new jqPlot powered display is a nice new way to see the data.


Hardening WordPress

This blog is run using WordPress. WordPress does not have the best record for bug-free software. To make sure esev.com doesn’t get overrun by viruses, I’ve taken a few additional steps to secure the site. All these steps follow a simple idea: if it isn’t needed by an average viewer of the blog, disable it.

1. Allow only http GET requests
Most of the changes to a WordPress blog happen with POST requests. By limiting the server to servicing only GET requests, very few modifications can be made to the blog. Of course, this means that none of the administration functions work. More on that in a bit.

2. Deny access to the administration pages
Most of the administration pages are stored in the wp-admin directory. These administration pages allow the blog owner to create new blog posts, add plugins, and customize the site. By denying access to the administration pages, nobody can use those pages to make changes to the blog.

3. Deny access to the login page
Again, if nobody can log in to the blog, it’ll be much harder for anyone to make changes.

4. Use an external comment system
The built-in comment system requires http POST requests, which were disabled by #1. The built-in system can also lead to a lot of comment spam. Use a comment provider like Disqus or IntenseDebate and you’ll be handing off the spam filtering to them.

With the blog locked down tightly using the above recommendations, it becomes hard to make any changes, even for the blog’s administrator. To allow an admin to access the blog, configure the web server to require SSL and http digest authentication for any action that could modify the blog.

To configure this for Apache, first setup the digest authentication:

    AuthType Digest
    AuthName "esev"
    AuthDigestDomain /blog/wp-admin/
    AuthDigestProvider file
    AuthUserFile /path/to/htdigest.password/file
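The password file referenced by AuthUserFile is created with Apache’s htdigest utility; the realm argument must match the AuthName above, and the username here is just an example:

```shell
# -c creates the file; the realm ("esev") has to match AuthName exactly
htdigest -c /path/to/htdigest.password/file esev admin
```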

Then configure the additional restrictions. To limit the web server to only accepting GET requests add this:

    <LocationMatch "^/(?!(blog/(wp-cron|index)\.php))">
        <LimitExcept GET>
            Require valid-user
        </LimitExcept>
    </LocationMatch>

If a request other than a GET arrives at the web server, the client is presented with an http digest authentication dialog. Without the proper username and password, the request will be denied.

Access to all of the administration pages should be denied. The following configuration section for Apache takes care of this, and allows the blog administrator to bypass the restrictions by logging in.

    <LocationMatch "^/+(blog/+(wp-login\.php|wp-admin)|$)">
        Require valid-user
    </LocationMatch>

Sure, the http authentication dialog box looks a bit ugly, but it prevents anyone without the proper username and password from accessing any content that isn’t needed. Alternatively, something like mod_auth_cookie_mysql could provide a nicer login interface for the administrator.

I don’t think this is a bullet-proof way to keep a WordPress site safe, but it should prevent any automated tools from hijacking your blog.


Updating esev.com’s SSL certificate

The SSL certificate on esev.com was updated today. I get the SSL certificates from StartSSL, mainly because they are free and trusted by most browsers. StartSSL only needs to validate your email address and that you are the owner of the domain, then you’re free to create as many certificates as you need.

So I don’t need to look it up again next year, here is the one-liner for generating the server’s certificate:

openssl req -new \
    -newkey rsa:2048 -nodes -keyout esev.20130825.key \
    -out esev.20130825.csr
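Before submitting the CSR to StartSSL, it’s worth double-checking what it contains:

```shell
# Print the CSR's subject and key details in human-readable form
openssl req -noout -text -in esev.20130825.csr
```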

Front-end HTPC hardware: No perfect solutions

I’ve been searching for a while for a perfect front-end for my home automation and entertainment system.  In my setup, the front-end system needs to do the following

  1. display media on the tv over HDMI
  2. send digital audio to the receiver
  3. accept input from a remote control
  4. handle HD content streamed over the network
  5. run quietly and use little power

The front-end doesn’t need to have any storage, TV tuners, or DVD/Blu-ray drives.  That is all handled elsewhere in my setup.  My perfect front-end system would have these features

At least four USB ports
I have two RF remotes and one Bluetooth adapter plugged into each of my front-ends.  My primary RF remote is a media-center type remote – with a numeric keypad and play/pause style controls.  My secondary remote has a QWERTY keyboard and a touch pad mouse – used when surfing the web.  I also have a USB Bluetooth adapter.  I’d like to have at least four USB ports to support these devices and anything else I need to add in the future.

Suspend to RAM with USB wake-up
On average, I probably only watch about an hour of TV a day.  That means twenty-three hours of the day the front-end system is sitting idle.  I’d like to be able to put the computer into a low power mode (S3 – suspend to RAM) when it is not being used.  I’d also like to be able to wake-up the computer with a remote control rather than needing to push the power button.  In my setup, I have an RF remote control with a USB dongle plugged into the front-end.  I need the USB port to stay active when the computer is suspended so that it can wake-up when a button is pressed on the remote.
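On Linux this boils down to two knobs. A sketch only — the USB0 device name varies by motherboard, so check the first command’s output for yours:

```shell
# See which devices may wake the system; toggle the USB controller on
cat /proc/acpi/wakeup
echo USB0 > /proc/acpi/wakeup   # toggles the flag; run once, then re-check

# Suspend to RAM (ACPI S3); a button press on the RF remote's USB
# dongle should now wake the machine
echo mem > /sys/power/state
```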

NVIDIA GPU powerful enough to de-interlace at 1080p
My home automation and entertainment system is running Linux.  Today, the only graphics card vendor to fully support hardware acceleration under Linux is NVIDIA (see VDPAU).  Not all NVIDIA GPUs are created equal and I want one that supports de-interlacing at 1080p resolutions.

HDMI and digital audio out
My TVs have HDMI connections for video, but I don’t necessarily want the audio to go to the television.  Rather, I’d like the audio to go to my receiver.  The front-end system needs to have both an HDMI port and a digital audio out port.  I’d prefer a coax audio out over a fiber-optic audio out because I don’t have to worry about pinching and breaking a coax cable.

Network booting and Wake-on-LAN
I don’t want to have hard drives on my front-end systems.  I’m not doing any recording on these systems and a hard drive contributes to power use – and it needs to be backed up.  The network card in the front-end needs to support PXE booting.  This way I can store the OS on the back-end and easily update it and keep proper backups.  I’d also like the network card to support wake-on-lan (WoL).  If I ever upgrade the software, or lose power, I need my back-end server to start first, then send the wake-on-lan packet to each of the front-end computers.
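Waking the front-ends from the back-end is a one-liner once the NIC’s WoL support is enabled; the interface name and MAC address below are placeholders:

```shell
# Make sure the NIC will respond to magic packets ("g" = wake on magic packet)
ethtool -s eth0 wol g

# From the back-end server, wake a front-end by its MAC address
etherwake 00:11:22:33:44:55
```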

Gigabit Ethernet
I’m streaming HD content from the front-end and also booting over the network. I don’t want to slow down my wireless network with this traffic or have wireless interference disrupt my media.  I’d like to have a Gigabit Ethernet card in the front-end.

At least 4GB of RAM
I’m not putting a hard disk into my front-end system.  There will be no swap space and everything will need to be stored in RAM.  I’d like to have at least 4GB of RAM in a front-end system.

Bluetooth
I have Bluetooth USB dongles already, but it would be nice to have Bluetooth integrated right on the computer so I don’t have to have yet another USB dongle sticking out of the computer.

Serial port(s)
Yes, serial ports are old technology, but my television and receiver can be controlled over a serial interface.  In my experience, this is much more reliable than IR.  I’d like the front-end to have at least one serial port.  Two would be preferred.

IR Output
Several HTPCs come with an IR input and a media center remote.  I’d rather use RF for input so the PC can be out of sight. IR output is what I’d really like to see on an HTPC.  This is needed to control DVD players/gaming consoles and every other device that cannot be controlled via ethernet or serial.


I haven’t been able to find any retail HTPC computers that have everything I’d like in a front-end system. The NVIDIA Next Generation ION (aka ION2) based computers from Asus and Zotac come close, but most lack Bluetooth, serial ports, and IR output. It is also hard to tell which motherboards support network booting and wake on USB/LAN.  Hopefully the next generation of HTPC systems will have more of what I’d like. For now I’ll stick with the Zotac ZBOX HD-ID11 and add a Bluetooth dongle and the GC-100 for serial and IR.


One Server: Researching the hardware

Using my list of requirements, I set out to find the hardware for my new server.  I was building this from scratch so at minimum my purchase list needed to include

  1. hard drive storage
  2. server case
  3. motherboard, RAM & CPU

Hard drive storage
I decided to focus first on the requirements for the fileserver side of the project.  Recall that I was planning for 16TB of storage space.  At the time, the largest consumer hard drives were 2 TB.  I also wanted to be able to support multiple drive failures and be able to replace the drives without shutting down the system.  That meant I needed at least 10 hard drives.

When researching the hardware for this server I came across a good blog post from Adaptec about real-life RAID reliability.  That article compared the reliability of RAID-5 and RAID-6 arrays and showed that a RAID-6 array should last 172 times longer than a RAID-5 array.  Reliability was important to me on this project, so I decided to go with RAID-6.  The Adaptec article only considered enterprise-grade drives. I planned to build this server with consumer-grade drives.  Therefore, as a precaution, I chose to add two extra drives as hot spares.

A RAID-6 array with 10 drives was likely to run slow.  So I did some more searching and came across RAID-60.  RAID-60 combines the redundancy of RAID-6 with the speed of striping found in RAID-0.  However, to get 16 TB with RAID-60, and have two hot spares, I now needed 14 hard drives.  Six drives for each of the two RAID-6 arrays and two hot spares.
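The arithmetic behind that drive count, as a quick sanity check (2 TB drives, two RAID-6 arrays of six drives each, plus the hot spares):

```python
def raid60_usable_tb(arrays, drives_per_array, drive_tb):
    """Each RAID-6 array loses two drives' worth of capacity to parity."""
    return arrays * (drives_per_array - 2) * drive_tb

arrays, per_array, drive_tb, hot_spares = 2, 6, 2, 2

print(arrays * per_array + hot_spares)                # 14 drives total
print(raid60_usable_tb(arrays, per_array, drive_tb))  # 16 TB usable
```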

RAID-60 with 8 drives. Image from Wikipedia

I wanted to make sure the fileserver would run quickly so that I could stream video from it while MythTV was also recording new programs and all the virtual machines were also running.  I thought running everything off one set of storage drives might be too much, so I decided to split the VM storage from the NAS storage.  That meant adding additional drives.  I had four 1TB drives from my previous NAS, so I decided to use them for storing the VM images.

That put the total number of hard drives needed at 18.  This was shaping up to be quite a storage server!  The next task was determining how to fit that many hard drives into a computer case.

Server Case
I wanted the server to be able to stay running while I replaced a failed drive, so I needed a case that accommodated hot-swappable drive bays.  I considered putting the drives into some six-bay external drive enclosures, but decided that would get too expensive and end up using more power than was needed.  Plus, I could just see the cables getting disconnected between the external enclosures and the main CPU.

No tower-style case that I could find would hold that many drives, so I looked for rack mountable cases.   To fit 18 drives, a 4U rack mount case was needed.

Motherboard, RAM & CPU
I wanted to be able to expand this system in the future, so when choosing the motherboard I focused on server boards that supported dual CPUs.  My plan was to put the system together with one CPU, and if needed, add another CPU later.  I also needed to find a motherboard with multiple network interfaces, and plenty of PCI-express slots for adding RAID cards.  Since reliability was important to me, I focused only on motherboards that supported ECC RAM.  Form factor wasn’t a big issue for this system as it was being placed into a rack mount case with plenty of space.

For the CPU, I needed a processor that supported VT-d.  VT-d processors support mapping cards plugged into PCIe slots directly into virtual machines.  My goal was to create a virtual machine for the fileserver and map the RAID card directly into that VM.

Another goal of mine was to make the new server easy to administer.  I didn’t want to have to find a spare keyboard, mouse, and monitor and plug them all in when there was trouble.  The solution, IPMI.  A motherboard with IPMI would allow me to remotely control the keyboard, mouse, video and even attach a remote DVD-ROM to perform an OS install.  It is basically a built-in KVM over IP.  I can even remotely reset the computer using IPMI.
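With ipmitool, most day-to-day maintenance can be done over the network; the BMC address and credentials below are placeholders:

```shell
# Check and control power remotely via the motherboard's BMC
ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P secret power status
ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P secret power reset

# Attach to the serial-over-LAN console (text-mode remote access)
ipmitool -I lanplus -H 192.168.1.50 -U ADMIN -P secret sol activate
```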

Parts list
I ended up purchasing the following components for this system

  • 14 x 2 TB hard drives (5 x Hitachi HDS72202, 3 x SAMSUNG HD203WI, 6 x WDC WD20EADS)
  • Norco RPC-4220 4U rackmount case with 20 hot swappable drive bays
  • Supermicro X8DTi-F motherboard with 3 PCIe 8x slots and IPMI
  • 24 GB ECC Registered DDR3 1066 RAM
  • Intel Xeon E5506 Nehalem-EP 2.13 GHz processor

A home server using VMware ESX and ZFS

If you are like me, and you like technology, you probably find yourself wanting to try the latest operating systems and software.  You also likely have a router for your network, a NAS device for your storage, and maybe a web server for a blog or wiki.  After a while, you end up with the situation shown in the picture below: a closet full of servers.

The picture below is of my server closet from 2004.  I had a custom Linux router, a NAS box, a VoIP server, and several other computers for trying out operating systems and software.

Old Server Closet

My setup continued that way for several years.  It took up a lot of space.  It was loud.  It was hard to upgrade because I needed to physically sit at the computer to reload the operating system.  And it used a lot of power.

In 2007 I started to use virtualization to cut down the number of computers and make controlling and upgrading them easier.  I was able to get the number of computers down to only two: A NAS for storage, and a Linux computer for running VirtualBox.  Everything else I needed could run in a VirtualBox guest.

This worked well until 2009 when I started to run out of storage on my 3 terabyte NAS server.  As I was planning to replace it, I decided to try combining the two servers into one.  I wanted a server that would have plenty of disk space for my NAS and be able to run any operating system and software that I wanted to try out.

I called this my One Server project.  The next several posts will cover this project.  These posts describe the hardware behind the server, using VMware ESXi to replace my aging Linux VirtualBox server, setting up a FreeBSD ZFS NAS fileserver under VMware ESXi, and all the issues and solutions I discovered along the way.

  1. One Server: What is needed?
  2. One Server: Researching the hardware

One Server: What is needed?

To make sure I got the right hardware and software for this server I needed to know what the server was going to be used for.  I needed to get an idea of how much computing power I was going to use to run all the virtual machines.  And since this project started off as an upgrade to my NAS fileserver I also needed to figure out how much storage space I was going to need.

I knew from my previous VirtualBox server what guest operating systems I was going to run.  They were:

  • Astaro Security Gateway for the firewall/router
  • Windows 7 for a “standalone” computer used only for banking
  • Linux for a web server
  • Linux for a CrashPlan backup server
  • Linux for an OpenVPN server

My previous fileserver had four disks in a RAID-5 setup for a total of three terabytes.  It was very slow and I wanted to find a way to speed it up.  At the same time I needed to add enough disk space so that I wouldn’t have to think about disk space for a long time.  I previously used the file server for:

  • storing backup copies of my iTunes music and video libraries
  • keeping copies of operating system ISO install images for installing VMs
  • backing up my wife’s and my own laptop as well as my web server
  • storing an ever-growing 500 GB RAW photo library from my DSLR camera
  • storing video for my MythTV setup

I had also recently gotten a new camera capable of recording HD video.  HD video files take up a lot of space and with a new baby daughter I was recording a lot of video.

Doing the math, I decided I needed at least 8 terabytes of storage to comfortably cover my needs.  To make sure I wouldn’t have to worry about storage space again, and considering Moore’s law, I decided to double that and plan for 16 TB of storage space.

I had the following additional requirements of the server itself

  1. be reliable enough to run 24×7 for several years
  2. continue working without data loss if two hard drives fail
  3. allow for hard drives to be replaced without shutting down the system
  4. be easy to backup
  5. report any errors with the drives or the virtual machines so they can be fixed quickly
  6. be compatible with as many guest operating systems as possible
  7. be easy to install, maintain, and configure.  Well, easy for a technical person at least
  8. allow for remote maintenance of the host operating system
  9. have room for expansion (cpu/ram/disk/etc upgrades)
  10. be quiet
  11. not use too much electricity
  12. support multiple network interfaces. It is going to be my router and needs to plug into my cable/dsl modem as well as my LAN

The reliability of the server was my most important factor.  Since I was consolidating everything on this one server, if it ever went down nothing would work.  It was also going to store all our family photos and videos.  I planned to keep everything backed up, but I wanted to make sure I wasn’t going to lose those memories due to a failed disk or silent bit rot.


IntenseDebate and Google Analytics

I use IntenseDebate for the comment system on my blog. I also use Google Analytics to keep stats on how many people visit my site. To integrate the two, I created a Google Analytics plugin for IntenseDebate.  With this plugin, when someone leaves a comment, an event is added in Google Analytics.  This event can then be used with advanced segments in Google Analytics to see metrics focusing just on visits that lead to comments.

To see the IntenseDebate events in Google Analytics, browse to Content -> Event Tracking -> Categories -> IntenseDebate

The plugin is not currently in the approved list of IntenseDebate plugins.  I’d like to do more testing before submitting it.  If you’d like to help test, you can download my IntenseDebate Google Analytics plugin and add it to your IntenseDebate custom scripts.

The plugin can be customized to create a virtual page view if you’d like to create a goal based on comments.  To enable this, add the following to your IntenseDebate custom scripts.

var id_ganalytics_plugin = id_ganalytics_plugin || {};
id_ganalytics_plugin.use_vpage = true;

With this enabled, a page view for /service/IntenseDebate/CommentPosted will be created each time a comment is posted.

The plugin can further be customized to change the event that is tracked and to change the virtual page.  The following options control the event.  See the event tracking overview page for documentation on category, action, and label.

var id_ganalytics_plugin = id_ganalytics_plugin || {};
// Use id_ganalytics_plugin.use_event to enable/disable event based tracking
id_ganalytics_plugin.use_event = true;
 
// Use id_ganalytics_plugin.event_category to set the event category
id_ganalytics_plugin.event_category = 'IntenseDebate';
 
// Use id_ganalytics_plugin.event_action to set the event action
id_ganalytics_plugin.event_action = 'Comment Posted';
 
// Use id_ganalytics_plugin.event_label to set the event label
id_ganalytics_plugin.event_label = location.href;

The following two options control the virtual page views

var id_ganalytics_plugin = id_ganalytics_plugin || {};
 
// Use id_ganalytics_plugin.use_vpage to enable/disable virtual page tracking
id_ganalytics_plugin.use_vpage = false;
 
// Use id_ganalytics_plugin.vpage to set the virtual page to be tracked
id_ganalytics_plugin.vpage = '/service/IntenseDebate/CommentPosted';

Leave me a comment if you find this useful, or if there is anything you’d like to see changed.


Hiding Google Analytics Campaign Variables

Do you use a service like Google Analytics for viewing your website statistics? Are you keeping track of your inbound links using campaign variables (utm_source, utm_medium, utm_campaign)? I recently ran into a situation where Google search results were linking to URLs with my campaign variables in them. Not a good thing – it really messes up your stats by reporting Google searches as coming from another source! Not to mention causing duplicate copies of your content to appear in the search listings.

Thankfully there is a quick fix for Google. Setting the canonical header link will cause Google to re-evaluate the URL next time your site is indexed. But what about a user copy-and-pasting a link to another site, or bookmarking that link?

It turns out Google Analytics can parse campaign URLs in two different ways.  It can parse them from the query parameters (those variables that come after the ‘?’ in your URLs). Or, it can parse them when stored in the fragment after the ‘#’ in your URL. Google provides an API function to enable parsing of the fragment parameters.  The function is _setAllowAnchor(true).  You insert this just before the call to _trackPageview.

pageTracker._setAllowAnchor(true);
pageTracker._trackPageview();

In theory, this should work well. Google is not supposed to index the fragment parameters that come after a URL. But what if a user bookmarks the URL?  Or what if they copy-and-paste the URL to digg or another site? This still isn’t going to solve the problem.

Time to rethink. Ideally, the campaign variables would only be available to Google Analytics and not even show in the client’s URL bar. Then they cannot be indexed by search engines and it would be unlikely they’d be copy-and-pasted to another site by the user. Here is a better solution.

When campaign variables are passed to a web page, the PHP page that is loaded can look at the $_GET parameters and detect those variables. It can then remove them, stick them in a session cookie, and redirect the user on to the correct URL – the one without the campaign variables.

The fragment portion of the URL, the part after the ‘#’, can be modified by Javascript. When the redirected page loads, before the Google Analytics code is called, a bit of Javascript code can be used to pull the campaign variables out of the cookie and place them into the fragment portion of the URL. After the Google Analytics code runs, these campaign variables can be removed, the cookie can be deleted, and the original fragment (if there was one) can be restored.

So, putting it all together, here is what happens.

  1. User clicks on a link with campaign variables and visits your website with a URL that looks something like: http://example.com/page/?utm_source=youtube&…
  2. The PHP code running your website detects that the user has clicked on a link with campaign variables, stores those variables in a cookie, and redirects the user to that same URL but without the campaign variables.
  3. The user’s browser visits the new page, without the campaign variables, and passes the cookie along to that page. The new URL looks something like this: http://example.com/page/
  4. The page loads in the user’s browser. As it loads a bit of Javascript runs.  The Javascript adds the campaign variables to the fragment portion of the URL.  At this snapshot in time, the URL looks like this: http://example.com/page/#utm_source=youtube&…
  5. Immediately after the URL is rewritten, the Google Analytics page tracker code runs and credits the source to the intended campaign. Immediately afterward, the custom Javascript erases the variables from the fragment so that the user never sees them, putting the URL back to: http://example.com/page/#

That’s it. Google Analytics gets the proper information to keep track of your campaigns, the user doesn’t see a cluttered URL, and Google doesn’t get a chance to index the page with the campaign variables in the query string.

There is one annoyance.  That is, after the Google Analytics page tracker runs and the fragment is erased, it still leaves a single ‘#’ character in the URL.  But at least this won’t cause any harm if the user bookmarks it or copy-and-pastes it somewhere.  If anyone has ideas on how to get rid of this, please leave me feedback in the comments.

Now, if only this could be incorporated into Joost de Valk’s wonderful Google Analytics for WordPress plugin! I’ve modified my copy to do this already. See the attached Google Analytics for WordPress Modifications. This isn’t a complete plugin, only a modification to the source file for version 2.9.5 of the official plugin.
[Update: See the comments below. Adding this to Google Analytics for WordPress might not be that useful]

Lastly, the code :)  Here is the bit of PHP code that detects the Google Analytics variables, sets the cookie, and redirects the user to a “clean” URL. If you have run into a similar situation and solved it a different/better way, please leave a comment and let me know what you did. I’m very interested in knowing if this could be done a better way!

// Add any Google Analytics Campaign variables to the found_tags array.
// Remove them from the _GET array so they don't get forwarded on
$found_tags = array();
foreach(array('utm_source', 'utm_medium', 'utm_campaign', 'utm_term', 'utm_content') as $tag) {
    if(isset($_GET[$tag]) && !empty($_GET[$tag])) {
        $found_tags[$tag] = $_GET[$tag];
        unset($_GET[$tag]);
    }
}

// If any campaign variables were found, redirect the user to the "clean" URL
// after setting the 'gatmp' session cookie with the campaign variables.
if(count($found_tags) > 0) {
    setcookie('gatmp', http_build_query($found_tags));
    $dest = $_SERVER['SCRIPT_URI'];
    if( count($_GET) > 0 ) {
        $dest .= '?'.http_build_query($_GET);
    }
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: '.$dest);
    exit(0);
}

Next, here is the javascript that detects the cookie and passes the campaign variables on to Google Analytics. This code takes the place of the normal pageTracker._trackPageview() function call.

function gaTrackerClass() {
  this.cookieVal = false;

  // Grab the cookie, if it exists, store in this.cookieVal
  if (typeof(document.cookie) != "undefined" && document.cookie.length > 0) {
    var c_name = 'gatmp'; // Cookie name
    var c_start = document.cookie.indexOf(c_name + "=");
    if (c_start != -1) {
      var v_start = c_start + c_name.length + 1;
      var v_end = document.cookie.indexOf(";", v_start);
      if (v_end == -1) v_end = document.cookie.length;
      this.cookieVal = unescape(document.cookie.substring(v_start, v_end));
      // Unset the cookie so it doesn't get used multiple times
      document.cookie = c_name + "=; expires=Thu, 01-Jan-1970 00:00:10 GMT";
    }
  }

  // Our _trackPageview function. It emulates the behavior of the Google
  // function, using the cookie rather than query parameters in the URL.
  // If no cookie is found, just call the normal _trackPageview function
  this._trackPageview = function(str) {
    if( typeof(pageTracker) != "undefined" ) {
      if(this.cookieVal != false && typeof(window.location) != "undefined") {
        // Save the current fragment
        var hashtmp = window.location.hash;

        // Call Google Analytics and record the campaign variables
        window.location.hash = '#' + this.cookieVal;
        pageTracker._setAllowAnchor(true);
        pageTracker._trackPageview(str);

        // Restore the fragment to its original value
        window.location.hash = hashtmp;
      } else {
        pageTracker._trackPageview(str);
      }
    }
  }
}
var gaTracker = new gaTrackerClass();
gaTracker._trackPageview();