Using sort_naturally

[2]:
import pandas_flavor as pf
import pandas as pd
import janitor  # importing janitor registers sort_naturally on DataFrame

Let’s say we have a pandas DataFrame that contains wells that we need to sort alphanumerically.

[3]:
data = {
    "Well": ["A21", "A3", "A21", "B2", "B51", "B12"],
    "Value": [1, 2, 13, 3, 4, 7],
}
df = pd.DataFrame(data)
df
[3]:
  Well  Value
0  A21      1
1   A3      2
2  A21     13
3   B2      3
4  B51      4
5  B12      7

A human would sort it in the order:

A3, A21, A21, B2, B12, B51

However, the default sort in pandas compares strings character by character, so it doesn’t produce that order:

[4]:
df.sort_values("Well")
[4]:
  Well  Value
0  A21      1
2  A21     13
1   A3      2
5  B12      7
3   B2      3
4  B51      4

Lexicographic sorting doesn’t get us to where we want. A21 shouldn’t come before A3, and B12 shouldn’t come before B2. How might we fix this?

[5]:
df.sort_naturally("Well")
[5]:
  Well  Value
1   A3      2
0  A21      1
2  A21     13
3   B2      3
5  B12      7
4  B51      4
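
To see why the natural order differs from the lexicographic one, here is a minimal standard-library sketch of a natural sort key. It is only an illustration of the idea and is not necessarily how sort_naturally works internally; the helper name natural_key is hypothetical and not part of pyjanitor or pandas.

```python
import re


def natural_key(label):
    """Hypothetical helper: build a sort key that compares digit runs as numbers."""
    # re.split with a capturing group alternates text and digit runs:
    # "A21" -> ["A", "21", ""], so odd positions are always digit runs.
    parts = re.split(r"(\d+)", label)
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]


wells = ["A21", "A3", "A21", "B2", "B51", "B12"]

# Plain string comparison goes character by character, so "A21" < "A3".
lexicographic = sorted(wells)

# With the key, "A3" becomes ["a", 3, ""] and "A21" becomes ["a", 21, ""],
# so 3 is compared with 21 as an integer.
natural = sorted(wells, key=natural_key)

print(lexicographic)
print(natural)
```

The second list matches the order a human would choose, which is the same order of wells that sort_naturally produced above.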

Now we’re in sorting bliss! :)