Combining The Power Of Relational And Key=Value Style Under One Store
So, after populating my data in the previous article with 115,142 tweets the database grew to a size of 241692520 bytes (just 241MB). Now comes the fun part. I was totally blown away by what I can achieve by combining the power of relational and key value style under one store. So, for example I want to query all the tweets tweeted at Unix timestamp of 1353466014 (since I stored the timestamps as a string here is what my query looks like):
SELECT doc -> 'text' as tweet, doc -> 'created_at' as created_at
FROM my_store
WHERE doc @> 'created_at=>00001353466014';
I can add a simple count or any other well-known SQL aggregate function without complicating my data store with anything specific like map reduce or the hassle of learning a new language. Note that I padded my timestamp value with zeros since I am only storing strings as values. Also I am utilizing the @> operator, and that's going to use GiST to perform a quick bitmap index scan instead of a sequential scan.
That's pretty good for starters. Let's try to fetch out all the tweets that had hash tags in them:
SELECT doc -> 'text' as tweet, doc -> 'created_at' as created_at
FROM my_store
WHERE doc @> 'has_hashtags=>: t';
Yes querying the entire database and pulling out the complete data (that you won't probably do because you page the data) gives me 15105 rows in just under 205ms on average. Since we have SQL to hand let's make the query a little more complicated, and use a different operator for the same stuff and also sort the data by ‘created_at’:
SELECT doc -> 'text' as tweet, doc -> 'created_at' as created_at
FROM my_store
WHERE doc @> 'has_hashtags=>: t' AND doc ? 'has_urls'
ORDER BY doc -> 'created_at' DESC;
This sounds tasty! But there's more: Postgres has other operators. For example, you can also pull out hash tagged tweets with urls or mentions:
SELECT doc -> 'text' as tweet, doc -> 'created_at' as created_at
FROM my_store
WHERE doc @> 'has_hashtags=>: t' AND doc ?|
ARRAY[ 'has_urls' , 'has_mentions' ]
This is not all you can do with it! hstore comes with all sorts of operators and index systems that you can use to query hash stores. Check them out in the documentation. Now, despite the NoSQL boom I think we have some great examples and reasons of why RDBMS still remains a core part of many market giants (Facebook being something everyone knows).
Postgres just gives me one more reason to not ignore RDBMS. So, if you have been moving to some document stores just for the reason that RDBMS doesn't' provide them; think again! You can get the same rock solid durability with structure-free systems.
Defining Your Database Schema: A Guide to Different Methods
Learn about SQL, ORM, GUI, and declarative schema definition languages like YAML and XML...
Streamline Your Database Schema Management with Schema Guard
Discover how Schema Guard can help developers efficiently manage schema changes across multiple environments, from development to production deployment...
PostgreSQL is More Than Just a Boring Database!
The Five funniest possible usage scenarios for the Postgres. With respect to Python generation ;)...
The Key=Value Store That Nobody Used
Hstore is a key-value data type module in PostgreSQL, allowing users to store and query data in a flexible, unstructured format within a single column...