Lesson 23. Working With JSON

JSON is everything that we thought XML was going to be. FIGHT ME!! Back around 2005 when pterodactyls were still delivering the mail, this thing called XML hit the scene. Extensible Markup Language. Sadly, XML failed to deliver on its promises.

Fortunately, along came this thing called JavaScript Object Notation or JSON. JSON is a flexible file format. The key difference between JSON and CSVs is that JSON records can be of any length unlike CSVs where every row has the same number of columns. JSON records can also be nested. All of that means is you can't work with JSON and CSVs the same way without some transformation work. Fortunately, Pandas gives us those tools.

JSON Syntax

{
  "firstName": "Bob",
  "lastName": "Wakefield",
  "isAlive": true,
  "age": 44,
  "address": {
    "streetAddress": "12345 Street",
    "city": "Beverly Hills",
    "state": "CA",
    "postalCode": "90210"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "312.867.5309"
    },
    {
      "type": "office",
      "number": "312.222.2222"
    }
  ],
  "children": [],
  "spouse": null
  "life": "sad"
}

Examples

Example #1: Pulling A JSON File Into A DataFrame

Once you get it in a dataframe, it's smooth sailing from there.

In [4]:
import pandas as pd
df = pd.read_json('https://query.data.world/s/ohb5444qzgzjcm5pdvcyflcsrsedi5')
df
Out[4]:
time summary icon precipIntensity precipProbability temperature apparentTemperature dewPoint humidity pressure windSpeed windGust windBearing cloudCover uvIndex visibility precipType ozone
0 1530428400 Clear clear-night 0.0000 0.00 65.45 65.45 58.64 0.79 1014.5 4.23 4.23 243.0 0.25 0.0 9.798 NaN NaN
1 1530432000 Overcast cloudy 0.0000 0.00 65.16 65.16 58.62 0.79 1014.4 3.95 3.95 193.0 0.88 0.0 9.777 NaN NaN
2 1530435600 Overcast cloudy 0.0000 0.00 64.58 64.58 58.23 0.80 1014.1 4.21 4.21 185.0 0.92 0.0 9.778 NaN NaN
3 1530439200 Overcast cloudy 0.0000 0.00 64.46 64.46 57.87 0.79 1013.9 4.01 4.04 175.0 0.99 0.0 9.782 NaN NaN
4 1530442800 Overcast cloudy 0.0000 0.00 64.19 64.19 57.96 0.80 1014.1 3.81 3.81 182.0 1.00 0.0 9.108 NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
16569 1590112800 Clear clear-day 0.0083 0.01 78.69 78.69 48.05 0.34 1010.7 4.62 8.96 239.0 0.00 0.0 10.000 rain 316.3
16570 1590116400 Clear clear-night 0.0000 0.00 74.25 74.25 49.05 0.41 1010.9 2.41 7.73 250.0 0.01 0.0 10.000 NaN 317.7
16571 1590120000 Clear clear-night 0.0000 0.00 70.94 70.94 48.91 0.46 1011.1 2.58 6.56 294.0 0.01 0.0 10.000 NaN 317.5
16572 1590123600 Clear clear-night 0.0000 0.00 68.82 68.82 48.74 0.49 1011.8 2.68 5.03 246.0 0.02 0.0 10.000 NaN 316.3
16573 1590127200 Clear clear-night 0.0000 0.00 67.09 67.09 49.13 0.52 1012.1 3.02 6.72 155.0 0.03 0.0 10.000 NaN 315.7

16574 rows × 18 columns

Copyright © 2020, Mass Street Analytics, LLC. All Rights Reserved.