Python Archives | Justin Joyce

Reverse a string in Python

Justin — Thu, 08 Feb 2024 17:28:40 +0000

The simplest way is to use Python double colon slicing with a negative step value:

my_string = "abc123"
reversed_string = my_string[::-1]
print(reversed_string)
# 321cba

You can also use the builtin reversed method, which will “Return a reverse iterator over the values of the given sequence” according to its official help(). Note that this returns an iterator, not a string, so you’ll have to do a bit more work to create the reversed string:

my_string = "abc123"
reverse_iterator = reversed(my_string)
# 

new_string = ""
for char in reverse_iterator:
  new_string += char
  
print(new_string)
# 321cba

You could also build a list from the reversed object and then .join it like so:

my_string = "abc123"
reverse_iterator = reversed(my_string)
# 

"".join(list(reverse_iterator))
# "321cba"

Both of these methods work equally well on lists.
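
As a quick check of that claim, here is a minimal sketch that applies slicing and reversed() to a list. The variable names are illustrative and not from the original post.

```python
numbers = [1, 2, 3]

# Slicing with a negative step returns a new, reversed list
sliced = numbers[::-1]

# reversed() returns an iterator, so wrap it in list() to materialize it
via_reversed = list(reversed(numbers))

print(sliced)        # [3, 2, 1]
print(via_reversed)  # [3, 2, 1]
print(numbers)       # [1, 2, 3] (the original list is untouched)
```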

The post Reverse a string in Python appeared first on Justin Joyce.

Replace a string in Python

Justin — Sun, 29 Oct 2023 15:21:22 +0000

To replace a string or substring in Python you have two main options:

Built in string.replace()
Python’s standard library re.sub()

Python’s String Replace Method

The string.replace method works roughly how you would expect it to:

my_string = "abc123"
new_string = my_string.replace("b","X")
print(new_string)
# "aXc123"

By default, string.replace will replace all occurrences of the string you want to replace. However, it accepts an optional integer third argument specifying how many times to replace:

my_string = "aaabbbccc"
my_string.replace("a", "X", 2)
# "XXabbbccc"

One important note: string.replace does not modify the existing string, it returns a new copy with the replacement performed.

Since string.replace returns a string, you can also chain replace operations:

# Chain as many replacements as you like
my_string = "aaabbbccc"
my_string.replace("a", "X", 2).replace("b", "").replace("c", "Z")
# "XXaZZZ"

However, with more complex replacements you might want to use regex instead.

Python Regex re.sub

Unless I’m doing a simple character swap or character removal (replace with empty string), I tend to use regex. If you can create the regex pattern, re.sub can probably use it to replace content for you.

Regex is a deep topic, and I actually wrote up a regex cheatsheet post, but here’s a simple re.sub example:

import re

my_string = "abc123"

# remove all digits
new_string = re.sub(r"\d", "", my_string)
print(new_string)
# "abc"

Like string.replace above, using re.sub does not modify the original string, it returns a new copy.
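
Beyond removing characters, re.sub can also reuse parts of the match in the replacement. The sketch below is an illustration with made-up sample data: it uses capture groups with backreferences, plus the optional count argument, which plays the same role as the third argument to string.replace.

```python
import re

iso_date = "2023-10-29"

# Three capture groups: year, month, day
pattern = r"(\d{4})-(\d{2})-(\d{2})"

# \3, \2 and \1 refer back to the captured day, month and year
day_first = re.sub(pattern, r"\3/\2/\1", iso_date)
print(day_first)  # 29/10/2023

# The optional count argument limits how many matches are replaced
first_digit_only = re.sub(r"\d", "#", "abc123", count=1)
print(first_digit_only)  # abc#23
```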

Here’s an example from my day job, comparing two XML files:

# Similar to an actual task I had to do at work recently
import re

# These two files were supposed to be the same, but the
# spacing and indentation made them hard to compare
with open("file_one.xml", "r") as file_one:
  xml_one = file_one.read()
  
# Don't forget your context managers!
with open("file_two.xml", "r") as file_two:
  xml_two = file_two.read()
  
tabs_or_newlines = "[\t\n]"

# substitute with empty string "" to remove tabs and newlines
xml_one_stripped = re.sub(tabs_or_newlines, "", xml_one)
xml_two_stripped = re.sub(tabs_or_newlines, "", xml_two)

# Without the weird tabs and line breaks, they should be the same
if xml_one_stripped == xml_two_stripped:
  print("they're the same")
else:
  print("not the same")

For more information, check out the helpful doc links below.

Helpful Links

String.replace – official Python docs
Python re.sub – official Python docs
Python context managers – Me!

The post Replace a string in Python appeared first on Justin Joyce.

JSON in Python

Justin — Sun, 13 Aug 2023 14:00:01 +0000

Working with JSON in Python relies on Python’s builtin json module. After you import json, here are the json module methods:

json.loads – deserialize a json string into the appropriate Python type(s)
json.dumps – serialize a Python “object” into a json string
json.load – deserialize json from a file
json.dump – serialize json into a file

Notice that the methods ending in “s” deal directly with strings, whereas the others deal with files. To disambiguate them, I call them “load string” and “dump string” (in my own head, at least).

json.loads (load string)

Use this when you need to deserialize a json string, like when handling a json API response:

import json

person_str = '{"name": "justin", "age": 100}'
person_dict = json.loads(person_str)

print(person_dict)
# {'name': 'justin', 'age': 100}

Default value parsing

Python json.load and json.loads also provide some nice keyword argument hooks for loading json data: parse_float, parse_int, and parse_constant. Each of these is called on its specified data type during loading, and the result of the hook function is what comes out at the end. Let’s do a quick example.

Say you know that all the float values in a json payload need to be converted to Decimal type. We can do that easily with parse_float:

import json, decimal

person_str = '{"name": "justin", "dollars": 50.25}'

json.loads(person_str, parse_float=decimal.Decimal)
# {"name": "justin", "dollars": Decimal("50.25")}

The other parse_ hooks work just like this. There’s even an object_hook, which is called with every decoded JSON object (each dict, including nested ones), and whatever it returns is used in place of that dict. For more, see the official docs.
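
Here is a small sketch of object_hook in action. The upper_keys function and the sample payload are hypothetical, made up for illustration. The hook runs once per decoded JSON object, nested objects first, and its return value replaces the dict.

```python
import json

def upper_keys(obj):
    """Hypothetical hook: return a copy of the dict with upper-cased keys."""
    return {key.upper(): value for key, value in obj.items()}

payload = '{"name": "justin", "pet": {"name": "rex"}}'

# The hook is applied to the inner "pet" object and to the outer object
decoded = json.loads(payload, object_hook=upper_keys)
print(decoded)
# {'NAME': 'justin', 'PET': {'NAME': 'rex'}}
```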

json.dumps (dump string)

Using the same person from above:

import json

person = {"name": "justin", "age": 100}
json.dumps(person)
# '{"name": "justin", "age": 100}'

With pretty printing

For nicer formatting, you might want to pretty-print your json:

import json

person = {"name": "justin", "age": 100}

# use the indent kwarg for nicer formatting
print(json.dumps(person, indent=4))
{
    "name": "justin",
    "age": 100
}

You can also sort the keys of your json, which makes life easier when inspecting large json objects:

import json

person = {"name": "justin", "age": 100}

# use the sort_keys kwarg to sort object keys
print(json.dumps(person, indent=4, sort_keys=True))
{
    "age": 100,
    "name": "justin"
}

These pretty-printing and formatting kwargs work exactly the same in json.dump also.

json.dump (to file)

This works just like json.dumps, but instead of writing to a string it writes to a file:

import json

person = {"name": "justin", "age": 100}

with open("person_file.json", "w") as outfile:
	json.dump(person, outfile)

json.load (from file)

This works just like json.loads above, but instead of acting on a string it reads from a file:

import json

# read the file we just created above
with open("person_file.json", "r") as infile:
	person = json.load(infile)
    
print(person)
# {"name": "justin", "age": 100}

Type conversions

Converting Python into json is not 100% apples to apples. Tuples become arrays¹, None isn’t valid json, True and False aren’t capitalized, etc. Here’s the conversion table lifted directly from the Python json docs:

JSON	Python
object	dict
array	list
string	str
number (int)	int
number (real)	float
true	True
false	False
null	None
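
To see a few rows of this table in action, here is a minimal round-trip sketch. The sample data is made up for illustration.

```python
import json

data = {"point": (1, 2), "missing": None, "ok": True}

encoded = json.dumps(data)
print(encoded)
# {"point": [1, 2], "missing": null, "ok": true}

decoded = json.loads(encoded)
print(decoded)
# {'point': [1, 2], 'missing': None, 'ok': True}

# The round trip is lossy: the tuple came back as a list
print(decoded == data)  # False
```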

Errors when reading or writing json

If you try and load something that isn’t json, or isn’t properly encoded, you’ll see either a TypeError or a JSONDecodeError:

import json

# This is a dict, not JSON
json.loads({"name": "justin"})
# TypeError: the JSON object must be str, bytes or bytearray, not dict

# This string has an extra " in it, it's not properly encoded
json.loads('{"name": "justin""}')
# JSONDecodeError: Expecting ',' delimiter: line 1 column 18 (char 17)

Gotcha: Decimal type

There is a small note in the official docs about “exotic” numerical types, like decimal.Decimal—they are not JSON serializable:

import json
import decimal

not_serializable = {"number": decimal.Decimal(100)}
json.dumps(not_serializable)
# TypeError: Object of type Decimal is not JSON serializable

You can use the default keyword argument of json.dumps to get around this issue. It takes a fallback encoding function that is called for any data that could not be serialized. In the case of the Decimal above, we could pass float:

import json, decimal

not_serializable = {"number": decimal.Decimal(100)}
json.dumps(not_serializable, default=float)
'{"number": 100.0}'

Careful with default though, it will apply to all non-serializable Python types. If you tried to use float and there was a datetime object in your data, you’d get a new error².
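
One way around that is to pass your own function as default instead of float. The sketch below is only an illustration: encode_extras is a hypothetical helper, and the choice of ISO strings for dates is an assumption rather than anything the json module prescribes.

```python
import datetime
import decimal
import json

def encode_extras(obj):
    """Hypothetical fallback encoder for types json can't handle natively."""
    if isinstance(obj, decimal.Decimal):
        return float(obj)
    if isinstance(obj, datetime.date):  # also covers datetime.datetime
        return obj.isoformat()
    # Anything else is still an error, same as json's default behavior
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

payload = {"number": decimal.Decimal(100), "when": datetime.date(2023, 8, 13)}

encoded = json.dumps(payload, default=encode_extras)
print(encoded)
# {"number": 100.0, "when": "2023-08-13"}
```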

¹ json doesn’t have a tuple type, so Python tuples become json arrays
² This blog post has a clever default encoding solution using python f-strings

The post JSON in Python appeared first on Justin Joyce.

Python sets

Justin — Sun, 30 Jul 2023 15:31:12 +0000

Sets are one of Python’s built-in types, and they’re very useful for deduplicating and comparing collections of data. Sets have tons of useful built-in functionality, and this post covers a lot.

Creating a set

There are a few options:

# Create an empty set
set_one = set()

# Create a set from an existing list
set_two = set([1, 2, 3])

# Create a set with single curly brackets
set_three = {1, 2, 3}

# If you use the single bracket method, you must pass
# elements to the set. Otherwise Python will create a dict
not_a_set = {}
type(not_a_set)
# dict

Check if a set contains a member

You can check for membership with classic Python in and not in:

my_set = set([1, 2, 3])

1 in my_set
# True

1 not in my_set
# False

Add members to a set

Add members one at a time

You can add individual members to a set via set.add():

my_set = {1, 2, 3}
my_set.add(4)
print(my_set)
# {1, 2, 3, 4}

If the element you’re trying to add is already in the set, .add() will do nothing:

my_set = {1, 2, 3}
my_set.add(2)
print(my_set)
# {1, 2, 3}

Or add members in bulk

To add more than one element at once, use set.update() with a list:

my_set = {1, 2, 3}
my_set.update([4, 5])
print(my_set)
# {1, 2, 3, 4, 5}

Update, like add, will not add any duplicate values:

my_set = {1, 2, 3}
my_set.update([2, 3, 4])
print(my_set)
# {1, 2, 3, 4}

Remove members from a set

There are several options here:

set.discard(n) – removes n from the set, does nothing if n isn’t present. Returns None.
set.remove(n) – removes n from the set, raises a KeyError if n isn’t present. Returns None.
set.pop() – removes an arbitrary element of the set (you don’t get to choose which one). Raises a KeyError if the set is already empty. Returns the element which was removed.
set.clear() – empties the entire set. Returns None

my_set = {1, 2, 3, 4, 5}
my_set.discard(3) # {1, 2, 4, 5}
my_set.discard(3) # {1, 2, 4, 5}, no error
my_set.remove(2) # {1, 4, 5}
# Calling my_set.remove(2) again would raise KeyError: 2

val = my_set.pop()
print(val, my_set)
# 1 {4, 5} (which element gets popped is arbitrary)

my_set.clear() # set()

Determine if a list has duplicate values

This comes in handy often when doing quick investigation work:

my_list = [1, 2, 3, 4, 2, 3, 6]

# Set members are always distinct
# This will automatically dedupe the list
my_set = set(my_list)

len(my_list) # 7
len(my_set) # 5
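
The length comparison above can be wrapped in a tiny helper. This is a sketch: has_duplicates is a hypothetical name, and it assumes the list items are hashable. The dict.fromkeys line is an extra not covered in the post, useful when you want to dedupe but keep the original order.

```python
def has_duplicates(items):
    """Return True if any value appears more than once (items must be hashable)."""
    return len(set(items)) != len(items)

my_list = [1, 2, 3, 4, 2, 3, 6]

print(has_duplicates(my_list))    # True
print(has_duplicates([1, 2, 3]))  # False

# dict keys are unique and keep insertion order, so this dedupes in order
deduped = list(dict.fromkeys(my_list))
print(deduped)  # [1, 2, 3, 4, 6]
```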

Determine the difference between sets

There are two … different ways to do this: difference and symmetric_difference.

Using set.difference()

Calling a.difference(b) will give you a new set containing the elements that are in a but not in b. Order matters here, so a.difference(b) will give different results from b.difference(a):

a = {1, 2, 3}
b = {2, 3, 4}
unique_to_a = a.difference(b)
# {1}

# To get values unique to b, switch the order
unique_to_b = b.difference(a)
# {4}

Python also gives us a shorthand for set.difference, the - sign:

a = {1, 2, 3}
b = {2, 3, 4}
unique_to_a = a - b
# {1}

Using set.symmetric_difference()

Symmetric difference between sets is defined as all elements in either set which are not in both sets. Using the same a and b:

a = {1, 2, 3}
b = {2, 3, 4}

# order doesn't matter for symmetric_difference
a.symmetric_difference(b)
# {1, 4}

# This also has a shorthand operator: ^
a ^ b
# {1, 4}

I’m not sure I’d recommend using the ^ operator here as it’s not very commonly-seen and could confuse readers of your code.

Bonus: set.isdisjoint()

This will return True if two sets have no common elements:

a = {1, 2, 3}
b = {4, 5, 6}
a.isdisjoint(b)
# True

From the Python docs: Sets are disjoint if and only if their intersection is the empty set.

Bonus: compare dictionary keys

This has come in handy for me when investigating large dicts. Since a Python dict is technically an iterable, it can be passed into a set(), which is a quick way to see if two objects have the same shape:

person = {"name": "justin"}
not_a_person = {"name": "Toyota", "model_year": 2007}

# It seems obvious with these small dicts
# but when there are dozens or hundreds of keys
# this comes in handy
set_one = set(person) # {"name"}
set_two = set(not_a_person) # {"name", "model_year"}

set_one == set_two # False
set_one.symmetric_difference(set_two) # {"model_year"}

Note that above, only the dict keys are passed into the set. That’s due to the iterable nature of Python dicts—only the keys are iterated over. To get the values also, you need dict.items().
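
As a related sketch, dict.keys() returns a view that already supports the set operators, so you can skip the set() call for key comparisons. The items() comparison below assumes all the values are hashable.

```python
person = {"name": "justin"}
not_a_person = {"name": "Toyota", "model_year": 2007}

# keys() views support set operators directly
extra_keys = not_a_person.keys() - person.keys()
print(extra_keys)  # {'model_year'}

shared_keys = person.keys() & not_a_person.keys()
print(shared_keys)  # {'name'}

# items() yields (key, value) pairs, so this compares values too
differing_items = set(person.items()) ^ set(not_a_person.items())
print(differing_items)
```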

Finding set intersections

Use the very appropriately-named intersection() to get a new set containing the values common to both sets:

a = {1, 2, 3}
b = {2, 3, 4}
a.intersection(b)
# {2, 3}

# Intersection also has a shorthand operator: &
a & b
# {2, 3}

Supersets and subsets

Use set.issuperset() or set.issubset()¹:

a = {1, 2, 3}
b = {1, 2}

a.issuperset(b) # True
b.issubset(a) # True

# Order matters
a.issubset(b) # False
b.issuperset(a) # False

Combine two (or more) sets

You can use the union method to combine sets:

a = {1, 2, 3}
b = {3, 4, 5}
a.union(b)
# {1, 2, 3, 4, 5}

# This has a shorthand also: |
a | b
# {1, 2, 3, 4, 5}

Frozenset – Immutable sets

The frozenset class is a set which is immutable after it’s created. Once initialized, nothing can ever be added to or removed from a frozen set:

a = frozenset([1, 2, 3])
a.add(2)
# AttributeError: 'frozenset' object has no attribute 'add'

a.clear()
# AttributeError: 'frozenset' object has no attribute 'clear'

This immutability allows frozen sets to be hashable, meaning they can be used as members of other sets or as keys in a dictionary.
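
Here is a short sketch of what hashability buys you. The variable names and data are made up for illustration.

```python
# A frozenset works as a dict key, and element order doesn't matter
pair_scores = {frozenset({"a", "b"}): 1}
print(pair_scores[frozenset({"b", "a"})])  # 1

# Frozensets can also be members of a regular set
nested = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in nested)  # True

# A regular (mutable) set can't be a set member
try:
    invalid = {{1, 2}}
    regular_set_rejected = False
except TypeError:
    regular_set_rejected = True

print(regular_set_rejected)  # True
```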

Believe it or not, there are more set methods, and more shorthand operators which I didn’t cover here. To learn more, check out the official Python docs.

Notes

¹ I’m not sure why Python broke with its usual snake_case for issuperset, issubset, and isdisjoint—it makes them harder to read / write.

The post Python sets appeared first on Justin Joyce.

Python try except

Justin — Thu, 01 Jun 2023 12:03:42 +0000

Try and except are the building blocks of exception handling in Python. You’ll also sometimes see finally and else. Here’s the summary:

try: run potentially-error-raising code in here
except: catches and handles errors that might have occurred in the try
finally: always runs after try / except, even if there were returns or re-raises
else: runs if try did not raise an error and try did not return

Try and Except

A simple try-except block looks like this:

try:
    risky_thing()
except Exception as e:
    print(f"oh no! exception: {e}")

You’ll usually see try and except by themselves; finally is used less often, and else even less. Here’s a (slightly) more realistic example:

me = {"name": "justin"}

def get_age(person):
    try:
        return person["age"]
    except KeyError as e:
        print(f"caught key error: {e}")

# There's no 'age' key on the dict
get_age(me)
# caught key error: 'age'

Except blocks optionally accept a specific error type—the example above will only catch a KeyError. In practice, you should always specify error type. If your code could produce multiple types of errors, just add an additional except block for each type:

try:
    dangerous_code()
except TypeError as e:
    pass  # handle error
except KeyError as e:
    pass  # handle error
except IndexError as e:
    pass  # handle error

Or group your exception handlers with parentheses and commas:

try:
    dangerous_code()
except (KeyError, IndexError) as e:
    pass  # handle these two errors
except (ValueError, TypeError) as e:
    pass  # handle these two errors

This way you know exactly how your code failed, and can log or fix it appropriately.

However, if you just need to ensure your code won’t blow up and you don’t care what kind of exception was raised, you can catch the Exception class:

try:
    dangerous_code()
except Exception as e:
    print(f"hit an error: {e}")

All non-fatal Python exception classes inherit from Exception, so you’ll catch almost any exception this way.
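
You can verify the "almost" part with issubclass. In this sketch, run_safely is a hypothetical helper. The point is that KeyboardInterrupt and SystemExit inherit from BaseException rather than Exception, so a catch-all except Exception lets them through.

```python
print(issubclass(KeyError, Exception))               # True
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(SystemExit, Exception))             # False
print(issubclass(KeyboardInterrupt, BaseException))  # True

def run_safely(func):
    """Hypothetical helper: call func and report any ordinary exception."""
    try:
        return func()
    except Exception as e:
        return f"caught {type(e).__name__}"

print(run_safely(lambda: {}["missing"]))  # caught KeyError
```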

Finally

Finally executes code at the end of your try-except block and is typically used to perform some kind of cleanup action, like ensuring a file was closed¹. Finally will run whether or not there was an exception, even if the exception was unhandled. Said another way, finally will always run:

me = {"name": "justin", "age": 100}

def get_age(person):
    try:
        print("Getting age")
        return person["age"]
    except KeyError as e:
        print("key error hit")
    finally:
        print("finally block hit")
        
my_age = get_age(me)
# Getting age
# finally block hit

print(my_age)
# 100

You might notice above that the finally block printed even though it’s down below a return statement. That’s unusual in Python; usually return exits the function before anything below it can run. Finally is an exception to this pattern: it will always run. To really drive this point home, let’s raise an unhandled exception:

# Don't do this
def bad_method():
    try:
        print("Starting")
        raise TypeError("AHHH")
    except KeyError as e:
        print("this won't hit, wrong error type")
    finally:
        print("the finally")
        
bad_method()
# Starting
# the finally
# TypeError: AHHH

Notice above that the raise still happened; bad_method() still raised the error, but it did it after the finally block executed.

One other important note about finally: if a finally clause includes a return statement, the returned value will be the one from the finally clause’s return statement, not the value from the try clause’s return statement²:

me = {"name": "justin", "age": 100}

def get_age(person):
    try:
        print("Getting age")
        return person["age"]
    except KeyError as e:
        print("key error hit")
    finally:
        print("finally block hit")
        return "hello"
        
my_age = get_age(me)
# Getting age
# finally block hit

print(my_age)
# hello

In the example above my_age is “hello”, despite the return statement in the try block which should have returned 100. That’s because a return statement inside a finally always takes precedence. If the finally does not contain a return statement the function will use the return value from the try or except blocks.

A return statement within a finally block is always the value returned from its function, even if there’s an unhandled exception in its try / except blocks. For that reason you should never return from within a finally block—you could accidentally swallow exceptions without even knowing it.
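
To make the swallowing concrete, here is a minimal sketch of the anti-pattern. The function name is made up, and newer Python versions may emit a SyntaxWarning for a return inside finally. The ValueError raised in the try block never reaches the caller.

```python
def swallows_error():
    try:
        raise ValueError("this error is lost")
    finally:
        # Don't do this: the return discards the in-flight ValueError
        return "from finally"

result = swallows_error()
print(result)  # from finally
```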

Else

Within try / except, an else block is used to evaluate code only if the try block did not raise an exception or return:

def else_example1():
    try:
        print("no errors here")
    except Exception as e:
        print("this won't print")
    else:
        print("this will print")

else_example1()
# no errors here
# this will print

def else_example2():
    try:
        print("no errors here")
        return "hi"
    except Exception as e:
        print("this won't print")
    else:
        print("this will also not print")

else_example2()
# no errors here
# "hi"

I rarely see else used in this context, and it might indicate a code smell—in many cases you can accomplish the same thing through appropriate use of raise.

Helpful Links

Errors and Exceptions – Python.org
Why you should specify error types in your except – Stackoverflow

Notes

¹ This is also why we use context managers
² This note about finally was lifted straight from the Python docs

The post Python try except appeared first on Justin Joyce.

Python double slash operator

Justin — Wed, 10 May 2023 01:44:28 +0000

Python’s double slash (//) operator performs floor division.

What exactly is floor division?

Floor division is a normal division operation except that it returns the largest possible integer. This integer is either less than or equal to the normal division result.

– Educative.io

In code, it looks like this:

# Regular python division
8 / 3 # 2.6666666

# Floor division
8 // 3 # 2

Some languages perform integer division by default when dividing integers, like Go and (surprisingly) Ruby, though Go truncates toward zero while Ruby floors the way Python’s // does. In Python, you have to use //.

Note: floor division always rounds down, not towards 0. It’s a possible gotcha if you’re working with negative numbers:

# Regular python division
-8 / 3 # -2.6666666

# Floor division always rounds down
-8 // 3 # -3
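
A short sketch contrasting flooring with truncation may help here. It uses math.floor, int() and divmod, none of which appear in the original post.

```python
import math

floored = -8 // 3
print(floored)  # -3

# math.floor on the true-division result agrees with //
via_math = math.floor(-8 / 3)
print(via_math)  # -3

# int() truncates toward zero instead, which is the gotcha
truncated = int(-8 / 3)
print(truncated)  # -2

# divmod returns the floor quotient and the matching remainder
quotient, remainder = divmod(-8, 3)
print(quotient, remainder)  # -3 1
```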

The post Python double slash operator appeared first on Justin Joyce.

Python for loops

Justin — Mon, 06 Mar 2023 12:51:11 +0000

For loops in Python are one of many features that make Python so popular; they’re as close to plain English as you can get when writing software. They generally look like this:

for element in iterable:
	# do things with element

What is an iterable? According to the Python docs an iterable is:

An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, …

In other words, an “iterable” is a “thing that can be looped through”.

Let’s do some examples.

Python “for in” loop

This very English-sounding for loop construction is one of the reasons I love Python. It will perform a single iteration for each element in your iterable:

# lists
for num in [1, 2, 3]:
	print(num)
# 1
# 2
# 3

Most iterable types work exactly how you’d expect:

# sets
for num in set([1, 2, 3]):
	print(num)
# 1
# 2
# 3

Even strings work the same:

# strings
for letter in "abc":
	print(letter)
# a
# b
# c

The continue statement

Similar to many other languages, continue will skip to the next iteration of a loop:

for num in [1, 2, 3, 4]:
    if num == 3:
        continue
    print(num)
# 1
# 2
# 4

The break statement

The break statement does what its name implies, breaks the loop:

for num in [1, 2, 3, 4]:
    if num == 3:
        break
    print(num)
# 1
# 2

If you have nested for loops, break will only break the innermost loop:

def nested():
    for num in [1, 2, 3]:
        for char in "abc":
            if char == "b":
                break
            print(num, char)

# We won't see 'b' or 'c'
# since we 'break' before printing 'b'
nested()
# 1 a
# 2 a
# 3 a

The else statement

In a for loop, else will execute code at the end of the loop:

def loop_with_else():
    for num in [1, 2, 3]:
        print(num)
    else:
        print("else block")

    print("done")

loop_with_else()
# 1
# 2
# 3
# else block
# done

However, the else block will not execute if the loop exits via a break or return statement:

def loop_with_else():
    for num in [1, 2, 3]:
        if num == 2:
            break
        print(num)
    else:
        print("else block")

    print("done")

loop_with_else()
# 1
# done
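
This behavior makes for-else a natural fit for searches, where the else branch means "nothing was found". A minimal sketch, with first_negative as a hypothetical helper:

```python
def first_negative(numbers):
    """Return the first negative number, or None if there isn't one."""
    for num in numbers:
        if num < 0:
            found = num
            break
    else:
        # Only runs when the loop finished without hitting break
        found = None
    return found

print(first_negative([3, -1, 4]))  # -1
print(first_negative([3, 1, 4]))   # None
```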

Return from within for loops

Another way to break out of for loops is via the return keyword. Unlike break, return will break out of all loops contained within its parent function:

def nested():
    for num in [1, 2, 3]:
        for char in "abc":
            if char == "b":
                return "returned"
            print(num, char)

# We won't see numbers 2 or 3
# Since we return at loop 1,b
nested()
# 1 a
# "returned"

Python for loop with dicts

Looping through dicts might require a small tweak. By default, using for in with a dictionary will only get you the keys:

my_dict = {"first_name": "justin", "last_name": "joyce"}

for thing in my_dict:
	print(thing)
# first_name
# last_name

However, this is an often-used pattern with dictionaries, and with one small addition we can access both keys and values—we need dict.items():

my_dict = {"first_name": "justin", "last_name": "joyce"}

for k, v in my_dict.items():
	print(f"Key is: {k}, Value is: {v}")
# Key is: first_name, Value is: justin
# Key is: last_name, Value is: joyce

This is exactly how you can invert a Python dictionary, which I cover in another post.

Looping with the range function

You’ll sometimes see examples using python’s range() builtin function:

for x in range(3):
	print(x)
# 0
# 1
# 2

Using range is less common than the for in structure in my experience, but you’ll see both.
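
For reference, range also accepts start, stop and step arguments. The sketch below shows those, plus the enumerate builtin (not covered in the original post), which is often used instead of range when you need both an index and the element.

```python
# range(start, stop, step): stop is exclusive
every_third = list(range(2, 10, 3))
print(every_third)  # [2, 5, 8]

# A negative step counts down
countdown = list(range(3, 0, -1))
print(countdown)  # [3, 2, 1]

# enumerate pairs each element with its index
letters = ["a", "b", "c"]
pairs = list(enumerate(letters))
print(pairs)  # [(0, 'a'), (1, 'b'), (2, 'c')]
```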

Unpacking nested iterables

Unpacking nested iterables within a single statement like this might not be the best idea, but Python will allow it:

def unpacking():
    for num, char in [(1, "a"), (2, "b"), (3, "c")]:
        print(num, char)

unpacking()
# 1 a
# 2 b
# 3 c

If you find yourself doing this you might want to consider refactoring—this kind of thing gets confusing quickly.

Helpful Links

Definition of iterable – Python docs
Swapping dict keys and values in Python – me!

The post Python for loops appeared first on Justin Joyce.

Python List Comprehensions

Justin — Tue, 28 Feb 2023 00:45:20 +0000

List comprehensions provide you with a “concise way to create lists”, according to the official docs, but they do a lot more than that. They’re a great feature of Python; let’s do a few examples to illustrate why.

List comprehension as a map function

my_list = [1, 2, 3]

doubled = [num * 2 for num in my_list]
# [2, 4, 6]

The example above is the same as using a map function in other languages, but more concise. You can also replicate filter with list comprehensions.

List comprehension as a filter function

my_list = [1, 2, 3]

odds = [num for num in my_list if num % 2 != 0]
# [1, 3]

Python does have builtins for map and filter, but I almost always find myself using list comprehensions instead; they’re shorter and much more pythonic.

If you have more complicated logic you can do something like this:

new_list = [complicated_function(i) for i in old_list]

# when you have a filter function
filtered_list = [i for i in old_list if complicated_filter(i)]

All of the examples above can equally be written as traditional for in loops, but list comprehension syntax is very commonly-used in Python; you’ll see it everywhere.

That said, list comprehensions certainly are not a drop-in replacement for traditional for loops, and if you have to perform multiple operations during a loop, or you need to add a nice explainer comment, you might need an old-school for.
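
For comparison, here is a sketch of the same operation written both ways, using a made-up list. The loop version is longer, but it leaves room for comments and extra steps.

```python
my_list = [1, 2, 3, 4]

# Comprehension: filter and map in one expression
doubled_odds = [num * 2 for num in my_list if num % 2 != 0]

# Equivalent old-school for loop
doubled_odds_loop = []
for num in my_list:
    # skip even numbers
    if num % 2 == 0:
        continue
    doubled_odds_loop.append(num * 2)

print(doubled_odds)       # [2, 6]
print(doubled_odds_loop)  # [2, 6]
```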

List Comprehensions for other types

The name “list” comprehension is a bit of a misnomer: the syntax works on anything iterable in Python, including lists, dicts, sets, tuples, and even strings. Since we already covered lists above, let’s do some examples with other iterable types.

Dicts

List comprehensions are commonly used in combination with dict.items() to invert a Python dictionary:

my_dict = {"a": 1, "b":2}

{value: key for key, value in my_dict.items()}
# {1: "a", 2: "b"}

Sets

Loop over items and create a set:

my_list = [1, 2, 3, 4]

# pluck out odd values and build a new set
my_set = {val for val in my_list if val % 2 != 0}
# {1, 3}

type(my_set) # set

Tuples

Pluck out a single value from a list of tuples:

list_of_tuples = [('nic', 'cage'), ('tom', 'hanks')]

first_names = [a for (a, b) in list_of_tuples]
# ['nic', 'tom']
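
Strings are iterable too, so the same syntax works on them. A quick sketch with a made-up word:

```python
word = "hello"

# Iterate over characters and build a list
upper_letters = [letter.upper() for letter in word]
print(upper_letters)  # ['H', 'E', 'L', 'L', 'O']

# Curly braces build a set instead, which dedupes the letters
unique_letters = {letter for letter in word}

# key: value inside curly braces builds a dict
letter_counts = {letter: word.count(letter) for letter in unique_letters}

# join the list back into a string
shouted = "".join(upper_letters)
print(shouted)  # HELLO
```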

Conclusion

List comprehensions are one of the features that make Python my personal favorite language to work in. They’re very concise, and they make it easy to work with and convert between all of Python’s iterable types.

Helpful Links

List Comprehensions – Python official docs
Python Iterables
Python For Loops – Me

The post Python List Comprehensions appeared first on Justin Joyce.

Writing CSVs in Python

Justin — Mon, 13 Feb 2023 13:00:57 +0000

To write CSVs in Python, you’ll want the builtin csv module. More specifically, I usually use csv.DictWriter.

Python DictWriter

Python’s csv module has both a writer and a DictWriter class, but I’m virtually always working with dictionaries, so I always use DictWriter.

It’s pretty straightforward. You grab your data, open a file in a context manager, create a writer, and write:

from csv import DictWriter

nic_cage = {
    "first_name": "Nicolas",
    "last_name": "Cage",
    "oscars": 1, # yep, he won one
    "description": "was in National Treasure",
}

tom_hanks = {
    "first_name": "Tom",
    "last_name": "Hanks",
    "oscars": 2,
    "description": "IS a national treasure",
}

actors = [nic_cage, tom_hanks]

# Open a file in write mode
with open("actors.csv", "w") as outfile:
    # The dict keys will be the csv headers
    headers = tom_hanks.keys()

    # create a writer
    writer = DictWriter(outfile, fieldnames=headers)

    # write the header row
    writer.writeheader()

    # write the rest of the rows
    writer.writerows(actors)

That’s it! Now you have a csv with a header row and two entries.

Important note: if any of your dicts have additional keys that aren’t in your defined set of headers, this will fail with ValueError: dict contains field not in fieldnames:

# This example will fail to write with a ValueError
from csv import DictWriter

tom_hanks = { "first_name": "Tom", "oscars": 1, "national_treasure": True }
nic_cage = { "first_name": "Nicolas", "oscars": 1 }

with open("actors.csv", "w") as outfile:
    # tom_hanks has an additional key in his data
    # This csv will fail to write
    headers = ["first_name", "oscars"]

    writer = DictWriter(outfile, fieldnames=headers)
    writer.writeheader()
    writer.writerows([tom_hanks, nic_cage])
    
# ValueError: dict contains field not in fieldnames: 'national_treasure'

However, if any of your dicts is missing a key which was defined in your headers, that piece of data will just be blank:

# This example will save fine, but with some blank data
from csv import DictWriter

tom_hanks = { "first_name": "Tom", "oscars": 2 }
meryl_streep = { "first_name": "Meryl", "last_name": "Streep", "oscars": 3 }

with open("actors.csv", "w") as outfile:
    # tom_hanks is missing "last_name" above
    # This csv will save fine, but his last_name will be blank
    headers = ["first_name", "oscars", "last_name"]

    writer = DictWriter(outfile, fieldnames=headers)
    writer.writeheader()
    writer.writerows([tom_hanks, meryl_streep])

Easy.
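
Both behaviors described above are configurable. The sketch below uses DictWriter's restval and extrasaction parameters, and writes to an in-memory io.StringIO buffer instead of a real file so it can run anywhere. The sample dicts and the "n/a" placeholder are made up for illustration.

```python
import io
from csv import DictWriter

tom_hanks = {"first_name": "Tom", "oscars": 2, "national_treasure": True}
meryl_streep = {"first_name": "Meryl"}

# An in-memory text buffer stands in for a real file here
buffer = io.StringIO()

writer = DictWriter(
    buffer,
    fieldnames=["first_name", "oscars"],
    restval="n/a",          # written when a dict is missing a header key
    extrasaction="ignore",  # silently drop keys that aren't in fieldnames
)
writer.writeheader()
writer.writerows([tom_hanks, meryl_streep])

print(buffer.getvalue())
# first_name,oscars
# Tom,2
# Meryl,n/a
```

Ignoring extra keys is convenient, but it also hides typos in key names, so the default of raising a ValueError is often the safer choice.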

Alternative Option: Pandas

The most often-used alternative is likely pandas’ DataFrame.to_csv(). If you do any kind of data analysis, you’re probably familiar with pandas, so I’ll just do a quick to_csv example:

import pandas as pd

# Pandas works on DataFrames, let's build a tiny one
df = pd.DataFrame(
    [["Tom", "Hanks"], ["Meryl", "Streep"]],
    columns=["first_name", "last_name"],
)

"""
df looks like this
  first_name last_name
0        Tom     Hanks
1      Meryl    Streep
"""

# You'll probably want index=False, or your CSV will save
# the row indices 0 and 1 in an unnamed first column
df.to_csv("actors.csv", index=False)

All done. Pandas saved a CSV with headers.

Helpful Links

csv.DictWriter – Python official docs
Python Context Managers – Me!
Reading CSVs in Python – Also me!

The post Writing CSVs in Python appeared first on Justin Joyce.

Python “is” operator vs double equals “==”

Justin — Sat, 11 Feb 2023 14:13:46 +0000

Python’s is operator compares object identity, while == compares object values.

Python “is” operator

In Python, is compares identity. In other words, it checks if two objects are the same object. It does not care if they have equal values; it cares if they have the same id in memory. This is why you often see is None comparisons in Python; there is only one None. You can look up its id with the builtin id(None), though the number differs between machines and runs.
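
As a rough illustration of the is None pattern, here is a small sketch. The find_actor helper and its data are hypothetical, made up for this example:

```python
def find_actor(name, actors):
    """Return the matching dict, or None when nothing matches (hypothetical helper)."""
    for actor in actors:
        if actor["first_name"] == name:
            return actor
    return None


result = find_actor("Meryl", [{"first_name": "Tom"}])

# Every None is the same object, so an identity check is the idiomatic test
missing = result is None

# The id of the returned None matches the id of the None literal
same_id = id(result) == id(None)

print(missing, same_id)
# True True
```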

Here are some examples of how is behaves:

1 is 1 # True in CPython, small integers are cached
"a" is "a" # True in CPython, short strings are interned

my_obj = {}
my_obj is my_obj # True, this is the same object

{} is {} # False, these are two different dicts
[] is [] # False, these are two different lists

a = 500
b = 500
a is b # False in the REPL. Yep, False.

You might see the last example above and think: “What? 500 is not 500?” Nope, not if you’re using is. At startup, CPython pre-builds and caches a set of commonly-used integers, specifically -5 to 256. Integers outside of that range are generally constructed as they’re needed, and each construction can end up at a different location in memory. This is an implementation detail: when both 500s live in the same script or function, the compiler may reuse a single object and is can return True. Something similar happens with strings, where the caching is called “interning”:

one = "justin"
two = "justin"

one is two # True. This seems ok..

three = "just in"
four = "just in"

three is four # False. Wait what?

The full explanation for why strings behave this way is a bit long for this post, but if you’re curious here’s a thorough Stack Overflow explanation. TL;DR: when comparing strings or numbers, you should use ==.
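
If you want to poke at this yourself, here is a hedged sketch. It builds the values at runtime, since literals in the same file can be merged by the compiler, and the is results for the plain values are CPython implementation details that may differ elsewhere. The one guaranteed behaviour shown is sys.intern, which returns a single canonical object per string value:

```python
import sys

# Build the values at runtime so the compiler cannot share one constant
a = int("500")
b = int("500")
small_a = int("256")
small_b = int("256")

three = " ".join(["just", "in"])
four = " ".join(["just", "in"])

# Values are equal; identity depends on the interpreter's caching
print(a == b, a is b)
print(small_a == small_b, small_a is small_b)
print(three == four, three is four)

# sys.intern hands back the same object for equal strings
interned_same = sys.intern(three) is sys.intern(four)
print(interned_same)
# True
```

Either way, the takeaway is the same: == gives a stable answer, while is depends on how the objects happened to be created.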

Python double equals `==` operator

Unless you’re comparing to None, True, or False, == should be your default. Python’s == will compare the values being tested, not their identities. Let’s use the same examples as above:

1 == 1 # True
"a" == "a" # True

my_obj = {}
my_obj == my_obj # True

{} == {} # True
[] == [] # True

a = 500
b = 500
a == b # True

That’s better.

Under the hood, == uses an object’s __eq__ method, which typically looks something like this:

my_dict = {}
help(my_dict.__eq__)

"""
...
__eq__(self, value, /)
    Return self==value.
...
"""

Instead of comparing memory locations—which is almost certainly not what you’re after—__eq__ compares the actual values.

__eq__ is one of many double-underscore or “dunder” methods in Python. There are tons of dunder methods built into common Python objects; this post has lots of details.
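
To see __eq__ at work on your own objects, here is a minimal sketch. The Actor and PlainActor classes are hypothetical, invented for this illustration:

```python
class Actor:
    """Hypothetical class that defines value-based equality."""

    def __init__(self, first_name, oscars):
        self.first_name = first_name
        self.oscars = oscars

    def __eq__(self, other):
        # Let Python try the other operand when it is not an Actor
        if not isinstance(other, Actor):
            return NotImplemented
        return (self.first_name, self.oscars) == (other.first_name, other.oscars)


class PlainActor:
    """Hypothetical class with no __eq__, so == falls back to identity."""

    def __init__(self, first_name):
        self.first_name = first_name


tom = Actor("Tom", 2)
also_tom = Actor("Tom", 2)

print(tom == also_tom)  # True, __eq__ compares the values
print(tom is also_tom)  # False, these are two different objects
print(PlainActor("Tom") == PlainActor("Tom"))  # False, no __eq__ defined
```

One thing to keep in mind: a class that defines __eq__ without also defining __hash__ becomes unhashable, so instances of Actor above can’t be used as dict keys or set members.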

Conclusion

Unless you know for certain it’s safe to use is, you should use ==. If you’re comparing strings or numbers, you should always use ==.

Helpful Links

The post Python “is” operator vs double equals “==” appeared first on Justin Joyce.
