« Previous | Next »

SQLite and JSON

24 Apr 2016

SQLite 3.9.0, released in Oct 2015, introduced the json1 extension that provides functions to manage JSON content stored in an SQLite database. The extension source code is included in the SQLite source amalgamation and is enabled by the -DSQLITE_ENABLE_JSON1 option.

The following assumes that the json1 functionality is available. As sample data, I use the stocks data set from jsonstudio.com. Here's what the wrapped first line of the JSON data file looks like partially:

{ "_id" : { "$oid" : "52853800bb1177ca391c17ff" }, "Ticker" : "A", "Profit
Margin" : 0.137, "Institutional Ownership" : 0.847, "EPS growth past 5
years" : 0.158, "Total Debt/Equity" : 0.5600000000000001, "Current Ratio" :
3, "Return on Assets" : 0.089, "Sector" : "Healthcare", "P/S" : 2.54,
"Change from Open" : -0.0148, "Performance (YTD)" : 0.2605, "Performance
(Week)" : 0.0031, "Quick Ratio" : 2.3, "Insider Transactions" : -0.1352,

To load the file into an SQLite database using NBSQLite3:

| jr db |
jr := (FileSystem / 'tmp' / 'stocks.json') asFileReference readStream.
jr ascii.
db := NBSQLite3Connection openOn: '/tmp/s.db'.
[   db execute: 'create table s (d json)'.
    db execute: 'begin'.
    [ jr atEnd ] whileFalse: [ 
        db execute: 'insert into s values (json(?))' "<== Note."
            with: (Array with: jr nextLine) ] 
    db execute: 'commit'.
] ensure: [ db close ]
jr close.

Note the invocation of the json() SQL function, which verifies that its argument is a valid JSON string and returns a minified version of that JSON string.

Let's inspect the created database with the sqlite3 shell tool:

$ wc -l stocks.json
6756 stocks.json
$ sqlite3 s.db
SQLite version 3.9.2 2015-11-02 18:31:45
Enter ".help" for usage hints.
sqlite> select count(*) from s;
6756
sqlite> .schema
CREATE TABLE s (d json);
sqlite> 

The json_extract(X, P1, P2, ...) function extracts and returns one or more values from the JSON structure X:

sqlite> select json_extract(d, '$.Sector', '$.Company') 
   ...> from s 
   ...> where json_extract(d, '$.Ticker') = 'AAPL';
["Consumer Goods","Apple Inc."]
sqlite> 

The json_each() table-valued function walks the JSON value provided as its argument and returns one row for each element. The schema for the table returned by json_each() is as follows:

CREATE TABLE json_tree(
    key ANY,             -- key for current element relative to its parent
    value ANY,           -- value for the current element
    type TEXT,           -- 'object','array','string','integer', etc.
    atom ANY,            -- value for primitive types, null for array & object
    id INTEGER           -- integer ID for this element
    parent INTEGER,      -- integer ID for the parent of this element
    fullkey TEXT,        -- full path describing the current element
    path TEXT,           -- path to the container of the current row
    json JSON HIDDEN,    -- 1st input parameter: the raw JSON
    root TEXT HIDDEN     -- 2nd input parameter: the PATH at which to start
);

Here's json_each() in action:

sqlite> select key, value 
   ...> from s, json_each(s.d) 
   ...> where json_extract(d, '$.Ticker') = 'AAPL'
   ...> limit 10;
_id|{"$oid":"52853800bb1177ca391c1809"}
Ticker|AAPL
Profit Margin|0.217
Institutional Ownership|0.621
EPS growth past 5 years|0.424
Total Debt/Equity|0.14
Current Ratio|1.7
Return on Assets|0.186
Sector|Consumer Goods
P/S|2.75
sqlite> 

Let's run the same query in Pharo:

| q db v |
q := 'select key, value from s, json_each(s.d) where json_extract(d, "$.Ticker") = "AAPL" limit 10'.
db := NBSQLite3Connection openOn: '/tmp/s.db'.
[   (db execute: q) rows do: [ :r |
        Transcript show: (r at: 'key') asString, ' = '.
        v := r at: 'value'.
        Transcript show: v asString, ' (', v class asString, ')'; cr ]
] ensure: [ db close ]

Transcript displays thusly. As we can see, SQLite and NBSQLite3 handle numerical values in the JSON data properly.

_id = {"$oid":"52853800bb1177ca391c1809"} (ByteString)
Ticker = AAPL (ByteString)
Profit Margin = 0.217 (Float)
Institutional Ownership = 0.621 (Float)
EPS growth past 5 years = 0.424 (Float)
Total Debt/Equity = 0.14 (Float)
Current Ratio = 1.7 (Float)
Return on Assets = 0.186 (Float)
Sector = Consumer Goods (ByteString)
P/S = 2.75 (Float)

Notice that the first row is well-formed JSON. If we use json_tree() instead of json_row(), that JSON structure will be returned as a key/value row as well.

Finally, let's read the first JSON record using NeoJSON:

| q db  |
q := 'select json_extract(d, "$") as data from s limit 1'.
db := NBSQLite3Connection openOn: '/tmp/s.db'.
[   (NeoJSONReader on: ((db execute: q) rows first at: 'data') readStream)
        next explore
] ensure: [ db close ].

This is powerful stuff. Well worth a look.

Tags: NoSQL, SQLite