<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: Counting items with mysql and regex in the group by statement</title>
	<atom:link href="http://www.cruzinthegalaxie.com/counting-items-with-mysql-and-regex-in-the-group-by-statement/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.cruzinthegalaxie.com/counting-items-with-mysql-and-regex-in-the-group-by-statement/</link>
	<description>Click on a tag...</description>
	<pubDate>Fri, 12 Mar 2010 07:47:31 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: David Woods</title>
		<link>http://www.cruzinthegalaxie.com/counting-items-with-mysql-and-regex-in-the-group-by-statement/comment-page-1/#comment-4792</link>
		<dc:creator>David Woods</dc:creator>
		<pubDate>Sat, 23 May 2009 03:09:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.cruzinthegalaxie.com/?p=162#comment-4792</guid>
		<description>That's slick, I love using REGEX pretty much wherever I can.

For gleaning info from tables that you don't have control of, that's a good way of doing things.

If you have control over the table, and this is going to be a frequently run piece of SQL, then you'd take some serious load off the server by adding a new field. It would hold either nothing or, for websites whose URLs you have verified against the regex in your scripting language, the landing page id that the script strips from the URL.

Has MySQL implemented triggers yet? I've never used them, but I would imagine that processing and saving that extra field would be a good place to use them. I should check out what's come out of the latest MySQL updates...</description>
		<content:encoded><![CDATA[<p>That&#8217;s slick, I love using REGEX pretty much wherever I can.</p>
<p>For gleaning info from tables that you don&#8217;t have control of, that&#8217;s a good way of doing things.</p>
<p>If you have control over the table, and this is going to be a frequently run piece of SQL, then you&#8217;d take some serious load off the server by adding a new field. It would hold either nothing or, for websites whose URLs you have verified against the regex in your scripting language, the landing page id that the script strips from the URL.</p>
<p>Has MySQL implemented triggers yet? I&#8217;ve never used them, but I would imagine that processing and saving that extra field would be a good place to use them. I should check out what&#8217;s come out of the latest MySQL updates&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Barger</title>
		<link>http://www.cruzinthegalaxie.com/counting-items-with-mysql-and-regex-in-the-group-by-statement/comment-page-1/#comment-4752</link>
		<dc:creator>Alex Barger</dc:creator>
		<pubDate>Fri, 22 May 2009 01:26:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.cruzinthegalaxie.com/?p=162#comment-4752</guid>
		<description>Awesome stuff, man! You are learning a lot at that company you are at now. I miss the old days of whiteboarding these concepts out and really optimizing the hell out of the process!</description>
		<content:encoded><![CDATA[<p>Awesome stuff, man! You are learning a lot at that company you are at now. I miss the old days of whiteboarding these concepts out and really optimizing the hell out of the process!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jim D</title>
		<link>http://www.cruzinthegalaxie.com/counting-items-with-mysql-and-regex-in-the-group-by-statement/comment-page-1/#comment-4748</link>
		<dc:creator>Jim D</dc:creator>
		<pubDate>Thu, 21 May 2009 23:43:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.cruzinthegalaxie.com/?p=162#comment-4748</guid>
		<description>Regular expressions are cool and all, but man... that is a nightmare for the query optimizer.  If you do an EXPLAIN on that bad boy, you'll see what I mean.

If you have the ability, I'd highly recommend adding an extra field to that table that stores the referrer as either a numeric id that maps to a separate referrer table, or an enum that you can add values to later.  Do the regular expression magic in PHP and store the value during INSERT.  MUCH faster in the long run.

If you take the enum route (google, yahoo, other), your query can simplify to:

  SELECT signup_date, COUNT(landing_page) AS signup_count, referrer
    FROM signup
   WHERE (signup_date BETWEEN :start_date AND :end_date)
GROUP BY referrer, signup_date
ORDER BY NULL

If signup_date is not already a DATE type, I would add another column that just stores the date portion of the timestamp.  It makes the index much more efficient.  (You ARE using indices, right?  :-))

ALTER IGNORE TABLE `signup` ADD INDEX `referrer` (`referrer`, `signup_date`);</description>
		<content:encoded><![CDATA[<p>Regular expressions are cool and all, but man&#8230; that is a nightmare for the query optimizer.  If you do an EXPLAIN on that bad boy, you&#8217;ll see what I mean.</p>
<p>If you have the ability, I&#8217;d highly recommend adding an extra field to that table that stores the referrer as either a numeric id that maps to a separate referrer table, or an enum that you can add values to later.  Do the regular expression magic in PHP and store the value during INSERT.  MUCH faster in the long run.</p>
<p>If you take the enum route (google, yahoo, other), your query can simplify to:</p>
<p>  SELECT signup_date, COUNT(landing_page) AS signup_count, referrer<br />
    FROM signup<br />
   WHERE (signup_date BETWEEN :start_date AND :end_date)<br />
GROUP BY referrer, signup_date<br />
ORDER BY NULL</p>
<p>If signup_date is not already a DATE type, I would add another column that just stores the date portion of the timestamp.  It makes the index much more efficient.  (You ARE using indices, right?  :-))</p>
<p>ALTER IGNORE TABLE `signup` ADD INDEX `referrer` (`referrer`, `signup_date`);</p>
]]></content:encoded>
	</item>
</channel>
</rss>
