• luciferofastora@feddit.org
    1 day ago

    Well, to be a little charitable, sometimes it’s text with numbers in it. I just need to figure out how best to extract the numbers from unstructured text, which is mostly tedious to validate.
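
    A minimal sketch of the "extract the numbers" step, assuming English-style formatting ("," for thousands, "." for decimals). The names `NUMBER_RE` and `extract_numbers` are made up for illustration, and real text will have cases this misses, which is exactly the tedious validation part:

```python
import re

# Hypothetical pattern: optional sign, then either a comma-grouped number
# (e.g. "1,250.75") or a plain integer/decimal (e.g. "12", "-3.5").
# Assumes "," is a thousands separator and "." is the decimal point.
NUMBER_RE = re.compile(r"[-+]?\d{1,3}(?:,\d{3})+(?:\.\d+)?|[-+]?\d+(?:\.\d+)?")

def extract_numbers(text: str) -> list[float]:
    """Return every number found in the text as a float."""
    return [float(m.replace(",", "")) for m in NUMBER_RE.findall(text)]

sample = "Ordered 12 boxes at 1,250.75 total, refund -3.5"
numbers = extract_numbers(sample)
```

    Something like "2021-03-04" would come out as 2021, -3 and -4 here, so dates, phone numbers and IDs need their own handling before a pattern like this is trustworthy.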

    Other times it’s text where there are supposed to be numbers, like the dates on invoices, which leads to really funny mixups when we look at the revenue per supplier and someone asks “Hey, we didn’t bring this supplier on until 2019, why are there revenues for 2012?” And the answer is “Because your invoice date is a manually entered text field and if you’re a fast typist, 2021 and 2012 are really close together.”
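
    One way to catch that kind of mixup early is a plausibility check against the year the supplier was onboarded. This is only a sketch: the supplier name, the date formats and the helper names are all invented, and it assumes the year appears as four digits somewhere in the text field:

```python
import re

# Assumption: the free-text date contains a four-digit year starting with 19 or 20.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")

def invoice_year(date_text):
    """Pull the first plausible year out of a free-text date, or None."""
    match = YEAR_RE.search(date_text)
    return int(match.group()) if match else None

def flag_implausible(invoices, onboarded_year):
    """Return (supplier, date_text) pairs with no readable year,
    or a year before the supplier was onboarded."""
    flagged = []
    for supplier, date_text in invoices:
        year = invoice_year(date_text)
        first_year = onboarded_year.get(supplier)
        if year is None or (first_year is not None and year < first_year):
            flagged.append((supplier, date_text))
    return flagged

# Hypothetical data for illustration only.
invoices = [("Acme", "03.05.2021"), ("Acme", "03.05.2012"), ("Acme", "soon")]
flagged = flag_implausible(invoices, {"Acme": 2019})
```

    It only flags rows for a human to look at; it can't tell whether 2012 was meant to be 2021 or something else entirely.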

    And then sometimes it’s questions like “How many customer service tickets do we get about X?” If X is a specific product name, odds are a simple full-text search for the term gets most of them. If X is a general thing like “Office supplies”, it becomes a nightmare really quickly.
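
    To illustrate the difference, here is a rough sketch of the naive search. The tickets, the product name "Widget 3000" and the keyword list are all invented:

```python
def count_matching(tickets, terms):
    """Count tickets containing at least one term (case-insensitive substring)."""
    lowered = [t.lower() for t in terms]
    return sum(any(term in ticket.lower() for term in lowered) for ticket in tickets)

# Hypothetical tickets for illustration only.
tickets = [
    "My Widget 3000 won't turn on",
    "Need more staples and paper",
    "widget 3000 makes a weird noise",
    "Login page is broken",
]

# Specific product name: one search term covers it.
product_count = count_matching(tickets, ["Widget 3000"])

# General category: someone has to guess a keyword list, and it is never complete.
category_count = count_matching(tickets, ["stapler", "paper", "pens"])
```

    The product search finds both relevant tickets with a single term. The category search only finds its ticket because "paper" happened to be on the list; "stapler" doesn't even match "staples", and every keyword you add to fix that risks pulling in unrelated tickets.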