JSON Strings and Python Objects for Data Wrangling: A Beginner's Guide (Part 1)

Ever come across a weird-looking bunch of curly braces and quotation marks holding all sorts of information? That, my friends, is JSON (JavaScript Object Notation), a super common way to store and share data these days. It's kind of like a secret code for computers to talk to each other, but worry not, we're about to crack that code together. Whether you're a Python newbie or just getting familiar with data wrangling, this tutorial is for you. We'll be venturing into the exciting world of JSON, and by the end, you'll be a JSON wrangler extraordinaire!

Here's what we'll be conquering today:

  • Understanding the magic behind JSON (structure, benefits).
  • Transforming mysterious JSON strings into Python objects we can play with.
  • Taking our Python objects and turning them back into JSON for storage or sharing.
  • Writing Python code to make this data transformation happen like a boss.

So, grab your favorite coding hat (or cat, no judgment here) and let's get started!

Unveiling the Mystery: What's the Deal with JSON?

Imagine you have a secret message for your friend, but instead of weird symbols, you use notes with plain words and phrases. That's kind of what JSON is like! It's a way for computers to exchange information in a format that's easy for both humans and machines to understand.

JSON started out as a way for web browsers and servers to chat, but it's become a superstar for all sorts of data exchange. Here's why it's so awesome:

  • Readable: Unlike some code that looks like gibberish, JSON uses curly braces, square brackets, and quotes, making it almost like peeking into a conversation between computers. You can actually kind of guess what the data means without needing a decoder ring!
  • Lightweight: JSON is like a data dieter. It keeps things simple and to the point, making it perfect for sending information over the internet without weighing things down.
  • Language Lover: JSON doesn't care what programming language you speak. It's a universal translator that different systems can understand, making data exchange a breeze.
  • Data Do-It-All: JSON can handle all sorts of information, from text and numbers to lists and even more complex stuff. It's like a data Swiss Army knife, ready for any task!

Let's see JSON in action! Imagine you're writing a blog post and want to store some information about it. Here's what that might look like in JSON:

[Image: blog post example in JSON]

Pretty cool, right? We can see the title, author, content, and even some tags, all neatly organized in this JSON code. That's the power of JSON – clear, compact, and ready to use!
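In case the snippet above does not show up for you, here is a minimal sketch of what such a JSON document could look like. Only the title comes from this tutorial; the author name, content, and tags are placeholder values I am assuming for illustration. I have wrapped it in a Python string and parsed it with json.loads() (we meet that function properly further down) just to confirm it is valid JSON.

```python
import json

# Hypothetical blog post record: only the title comes from this tutorial,
# every other value is a made-up placeholder.
blog_post_json = """
{
    "title": "Top 5 Python Tips for Beginners",
    "author": "Jane Doe",
    "content": "Python is a great first language...",
    "tags": ["python", "beginners", "tips"]
}
"""

# json.loads raises json.JSONDecodeError if the text is not valid JSON.
blog_post = json.loads(blog_post_json)

print(sorted(blog_post))      # the key names, alphabetically
print(blog_post["tags"][0])   # first element of the tags array
```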

JSON Ground Rules: Building Your Data Castle

Before we jump into wrangling JSON data with Python, let's establish some ground rules, like building blocks for your JSON castle! These rules will ensure your data is structured correctly and understood by both humans and computers.

  1. Keys and Values: Imagine keys as labels and values as the information they describe. In our blog post example, "title" is the key, and "Top 5 Python Tips for Beginners" is the value. Keys are always enclosed in double quotes ("), while values can be strings, numbers, booleans (true/false), null, arrays (ordered lists), or even nested objects (more complex data structures).
  2. Colons and Commas: Think of colons (:) as little bridges connecting keys and their values. Commas (,) separate key-value pairs, keeping things organized within the curly braces ({}).
  3. Order Doesn't Matter: Unlike some lines of code where order is crucial, the order of key-value pairs in a JSON object doesn't strictly matter. The computer will still understand the information as long as the keys and values are properly paired.
  4. Double Quotes are Kings (and probably Queens): When it comes to naming your keys (those labels), double quotes (") are mandatory. Single quotes (') won't work here, and the same goes for string values, so keep those double quotes handy.
  5. Whitespace is Welcome (But Not Necessary): Spaces, tabs, and new lines can be used to improve readability, making your JSON string look more presentable. While not strictly required, proper indentation can make your JSON code much easier for humans (like me and you!) to understand.
  6. Arrays and Nested Objects: An array is an ordered list of values enclosed in square brackets ([]). Imagine an array as a box holding a collection of related items. These items can be of any valid JSON data type, including other arrays or objects. Think of nested objects as rooms within your JSON castle: objects (wrapped in curly braces {}) contain key-value pairs, and those values can themselves be other objects! This allows complex data structures to represent real-world information effectively. The beauty of JSON is its flexibility! There's no strict limit on the number of arrays you can have within a JSON string, and you can nest arrays within other arrays, creating a hierarchy of data as needed. However, keep in mind that excessively nested structures can become difficult to read and maintain. Aim for clarity and structure when building your JSON data.
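A few of these ground rules are easy to check for yourself. The sketch below uses Python's built-in json module (introduced properly later in this tutorial) as a validator; the tiny "title" and "views" records are made-up examples of mine.

```python
import json

# Rule 3: key order does not matter once the data is parsed.
a = json.loads('{"title": "Hello", "views": 10}')
b = json.loads('{"views": 10, "title": "Hello"}')
print(a == b)

# Rule 5: extra spaces, tabs, and newlines between tokens are ignored.
spaced = json.loads('{ "title" :   "Hello",\n\t"views" : 10 }')
print(spaced == a)

# Rule 4: single quotes are rejected by the parser.
try:
    json.loads("{'title': 'Hello'}")
    single_quotes_ok = True
except json.JSONDecodeError as err:
    single_quotes_ok = False
    print("Invalid JSON:", err.msg)
```

Both comparisons come out equal, while the single-quoted version never makes it past the parser.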

Here's an example showcasing arrays and nested objects:

[Image: JSON example with arrays and nested objects]

In this example, we have an object with an "author" key containing another object with nested "name" and "experience" information. Additionally, the "topics" key holds an array with strings and a nested object containing a "subtopic" with its own array of "frameworks."
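To make that description concrete, here is one plausible shape for such a document. The key names follow the description above, but every value, and the exact layout of the nested "subtopic" object, is my assumption rather than the original snippet.

```python
import json

# Key names follow the description in the text; all values are
# illustrative placeholders, and the layout is one plausible reading.
nested_json = """
{
    "author": {
        "name": "Jane Doe",
        "experience": "5 years"
    },
    "topics": [
        "Python Basics",
        "Data Wrangling",
        {
            "subtopic": "Web Development",
            "frameworks": ["Django", "Flask"]
        }
    ]
}
"""

data = json.loads(nested_json)

# Walk down the nesting one level at a time.
print(data["author"]["name"])            # object inside an object
print(data["topics"][0])                 # plain string inside an array
print(data["topics"][2]["frameworks"])   # array inside an object inside an array
```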

The table below shows how each type is treated when converting to and from JSON. I will leave it to you to verify, so you get a taste of taking part in the design of this tutorial. Meet me in the next section once you have quenched your curiosity.

[Image: table of conversions between JSON and Python types]
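If you would like a head start on that verification, a short script can print the conversions in both directions. This is only a sketch using the standard json module; the sample values are arbitrary picks of mine.

```python
import json

# Python value -> JSON text (serialization)
python_values = [
    {"a": 1},   # dict  -> object
    [1, 2],     # list  -> array
    (1, 2),     # tuple -> array
    "text",     # str   -> string
    42,         # int   -> number
    3.14,       # float -> number
    True,       # True  -> true
    False,      # False -> false
    None,       # None  -> null
]
for value in python_values:
    print(f"{type(value).__name__:>8} -> {json.dumps(value)}")

# JSON text -> Python type (deserialization)
json_texts = ['{"a": 1}', "[1, 2]", '"text"', "42", "3.14", "true", "false", "null"]
for text in json_texts:
    print(f"{text:>10} -> {type(json.loads(text)).__name__}")

# A tuple does not survive a round trip: it comes back as a list.
round_tripped = json.loads(json.dumps((1, 2)))
print(type(round_tripped).__name__)
```

Notice that the mapping is not perfectly symmetrical: both lists and tuples become JSON arrays, but an array always comes back as a list.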

The weapon of our warfare is not carnal but the json Module!

Now that we've cracked the JSON code and understood its structure, it's time to bring that data into Python! Buckle up, because we're about to perform some data wrangling magic. Python provides a built-in tool, the json module, specifically designed for working with JSON data. It's like having a decoder ring that lets you translate between JSON text and Python objects. To use this magic tool, we first need to import it into our Python code.
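The import itself is a single line. In the tiny sketch below, the hasattr checks are just my own way of confirming that the two functions we are about to meet are really there.

```python
import json  # part of the standard library, nothing to install

# Confirm the two functions discussed next are available.
has_dumps = hasattr(json, "dumps")
has_loads = hasattr(json, "loads")
print(has_dumps, has_loads)
```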

In the context of the json module, these are two important functions:

  • dumps() (serialize): This function takes a Python object as input and returns a JSON string representation of that object. This is useful for saving data in a JSON format or for transmitting data between applications.
  • loads() (deserialize): This function takes a JSON string as input and returns a Python object representation of that data. This is useful for loading data that has been stored in a JSON format.

These two functions are essential for working with JSON data in Python. They allow you to convert between Python objects and JSON strings, which makes it easy to store, transmit, and work with data in a format that is widely used by many different applications and systems.

Here is an example of how to use the dumps() and loads() functions:

[Image: example code using dumps() and loads(), with its output]

In this example:

  1. I defined a list, posts, containing dictionaries that represent individual posts.
  2. I used json.dumps(posts) to convert the list of posts into a JSON string. This string can now be stored in a file or transmitted elsewhere, but I chose to print it for the sake of learning.
  3. I used json.loads(json_string) to convert the JSON string back into a Python list of dictionaries (representing posts). This allows us to work with the post data within our Python program.
  4. I also printed the data type of posts before serialization, after serialization, and after deserialization. If you have been following along, you should have an output that looks like the one in the code snippet above.
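The four steps above can be sketched roughly as follows. The names posts and json_string come from the steps themselves; the fields inside each post and the loaded_posts name are my assumptions.

```python
import json

# Hypothetical posts: the field names and values are assumptions.
posts = [
    {"id": 1, "title": "Hello JSON", "published": True},
    {"id": 2, "title": "Python Objects", "published": False},
]
print(type(posts))                       # before serialization

json_string = json.dumps(posts)          # serialize: Python list -> JSON text
print(json_string)
print(type(json_string))                 # after serialization

loaded_posts = json.loads(json_string)   # deserialize: JSON text -> Python list
print(type(loaded_posts))                # after deserialization
print(loaded_posts == posts)             # the round trip preserves the data
```

The type goes from list to str and back to list, and the loaded data compares equal to what we started with.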

Things are about to get even more interesting!

I have a snippet for us to look into.

[Image: code that creates a custom object, post, and tries to serialize it]

Looks like we’ve run out of luck here, our output is a disappointment.

[Image: output ending in a TypeError]
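Here is a rough sketch of that situation. The Post class and its attributes are hypothetical stand-ins of mine, so line numbers will not match the original snippet, and I catch the exception so the script keeps running.

```python
import json


class Post:
    """Hypothetical class: the attribute names are assumptions."""

    def __init__(self, title, author):
        self.title = title
        self.author = author


post = Post("Hello JSON", "Jane Doe")
print("Object created:", post.title)

try:
    json.dumps(post)          # json has no idea how to encode a Post
    error_message = None
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```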

Our object has been created, but things went bad on line 20 when we tried to serialize post. A TypeError is raised, and the message says post is not JSON serializable. In all honesty, my friends, json.dumps() cannot serialize an instance of a custom class out of the box, even one as simple as this example. What’s the hack? Well, we can dump post.__dict__ instead, but loading the result back gives us a plain dict, not our original object. The snippet below says it clearly!

[Images: code that dumps post.__dict__ and loads the result back]
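A minimal sketch of the __dict__ workaround, again with a hypothetical Post class whose attributes are my own assumptions:

```python
import json


class Post:
    """Hypothetical class: the attribute names are assumptions."""

    def __init__(self, title, author):
        self.title = title
        self.author = author


post = Post("Hello JSON", "Jane Doe")

# __dict__ holds the instance attributes as a plain dict, which json can encode.
json_string = json.dumps(post.__dict__)
print(json_string)

loaded = json.loads(json_string)
print(type(loaded))               # a dict, not a Post
print(isinstance(loaded, Post))   # the class information is lost
```

The data survives the trip, but the class does not: what comes back is a dictionary.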

Lemme show you the way out: we’re gonna supply our own functions for “default”, the parameter of dumps() that is called for any object the encoder cannot serialize by itself and must return something it can, and “object_hook”, the parameter of loads() that is called with every decoded JSON object (a dict) and whose return value replaces that dict. Have a look!

[Images: code passing custom functions as default and object_hook]
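One way this could look is sketched below. The default and object_hook parameters are real parts of the json module, but the Post class, the encode_post and decode_post functions, and the "__post__" marker key are all names I made up for illustration.

```python
import json


class Post:
    """Hypothetical class: the attribute names are assumptions."""

    def __init__(self, title, author):
        self.title = title
        self.author = author


def encode_post(obj):
    """Passed as default=: called only for objects json cannot encode itself."""
    if isinstance(obj, Post):
        # Tag the dict so the decoder can recognize it later.
        return {"__post__": True, "title": obj.title, "author": obj.author}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def decode_post(dct):
    """Passed as object_hook=: called with every decoded JSON object (a dict)."""
    if dct.get("__post__"):
        return Post(dct["title"], dct["author"])
    return dct  # leave ordinary dicts untouched


post = Post("Hello JSON", "Jane Doe")
json_string = json.dumps(post, default=encode_post)
restored = json.loads(json_string, object_hook=decode_post)

print(json_string)
print(type(restored).__name__, restored.title)
```

The marker key is a design choice, not a requirement: it simply gives the decoder a reliable way to tell a serialized Post apart from any other dictionary in the data.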

All I have done is write my own functions that understand our object and point default and object_hook to them. There is yet another way of achieving the same result: writing our own custom classes that inherit from json.JSONEncoder and json.JSONDecoder and tweaking them to produce our desired output. Since this post is getting a little long, I will leave that part out.

Conclusion:

We've conquered the basics of transforming JSON strings into Python objects and vice versa using the built-in json module! Now you have the power to unlock and manipulate data stored in JSON format.

But what if you encounter data that's not quite JSON-friendly? Fear not, Python warriors! In the next part of this series, we'll delve into the world of custom JSON encoder and decoder classes, handle exceptions that might arise in the process, and tackle more demanding examples, empowering you to handle even the trickiest data types.

Did you find this post helpful? Let me know in the comments below!

Do you have any questions about working with JSON data in Python?

Share this post with your fellow Python enthusiasts who might be curious to crack the JSON code!

Stay tuned for the next part, where we'll conquer custom JSON encoder and decoder classes together!

More articles by MICHAEL OKO
