Alexander Conroy

mySQL Tidbits: GROUP_CONCAT and ON DUPLICATE KEY UPDATE Examples

Mysql Tidbits:

GROUP_CONCAT

You can GROUP_CONCAT multiple result sets from multiple LEFT JOIN’s to create a single result set when using a primary table and relational “meta” table by using DISTINCT.

In order to get multiple result sets from this type of action you need to GROUP BY your results.

The following example is pretty complex. It takes 3 tables, combines multiple (left joined) rows from table2, gets the relevant (left joined) row from table 3 based on table2, then comma separates the values per ‘key’ value queried in table2 as a single column in table1 along with table1’s result set, showing only unique values.

Example:

  • SELECT column1,
    • column2,
    • column3,
    • column4,
    • GROUP_CONCAT( DISTINCT group_definition.value ) as 'definitions', 
    • GROUP_CONCAT( DISTINCT group.value ) as 'ids'
    • FROM table1
    • LEFT JOIN table2 AS group,
      • ON table1.user = group.user
      • AND group.key = 'some_key'
      • AND group.switch = 1
    • LEFT JOIN table3 AS group_definition,
      • ON group.value = group_definition.id
    • GROUP BY table1.id

This is an example of a query that accesses 3 Tables. Table1 is the primary table, which has unique information row by row by id (primary key). Table2 is a relational table, which is tied to table1 by the user column. Table2 uses a standardized but non unique ‘key’ column with  ‘value’ column unique to the ‘key’ and ‘user’. Table2 has a unique index on ‘user’, ‘key’ and ‘value’. table2 uses ‘switch’ as a boolean to decide whether or not the record is active or not. table3 holds the definition of table2’s ‘value’ as its own ‘value’ based off table3’s primary key ‘id’. The entire operation is grouped by table1’s primary key ‘id’, otherwise we do not get more than 1 result set.

The GROUP_CONCAT will collapse the result set that would normally be generated when finding multiple values in table2 related to table1, and instead place them as comma separated values which is MUCH more handy to deal with. We can use PHP or some other language to explode the ids and the definitions (ids, and definitions). This also allows us to query the database by table2 and table3 because they are included via LEFT JOIN, giving us exceptional flexibility with this query.

With this I can find a user from table1 that has a key, value and switch combination in table2, and return both the value as id number AND the definition value of table3 from those values, all in one row. It also allows me to pull multiple records unique only to table1, preventing duplicate data.

I know this may sound confusing, but it really is. I only hope my example will shed some light on a very complex query that can make your database ultra fast, sharp and flexible. I found many examples of this around the net but never at the complexity that I ended up with. 

ON DUPLICATE KEY UPDATE

You can use the INSERT INTO with ON DUPLICATE KEY UPDATE and VALUES() function to insert data in a database, or insert it, if a unique or primary key would be violated. It is a good idea to use a properly defined multi column unique index with this method. 

For example, lets say that user, key, and value are unique, and the table has a primary key of id. If we index user,key and value together, we can use this statement:

  • INSERT INTO table
    • ( user, key, value, switch ) 
    • VALUES 
      • (row1val1, row1val2, row1val3, row1val4), 
      • (row2val1, row2val2, row2val3, row2val4),
      • (row3val1, row3val2, row3val3, row3val4), 
      • (row4val1, row4val2, row4val3, row4val4),
    • ON DUPLICATE KEY 
      • UPDATE switch=VALUES(switch);

This would update the value that would have gone in as row1val4, row2val4, row3val4, row4val4 of that record if val1, val2 and val3 existed respectively, otherwise it would insert a new record and auto increment the id.

This should be helpful when trying to create database queries that add and update information, without having to go through the extra step of querying the database to see if the data already exists…shaved an extra step off my programming process, and a TON of queries by doing so!