A neat duckdb snipped for string normalization
A recent project of mine involved determining duplicate CRM objects across Salesforce and Hubspot. I utilized duckdb for my data processing and found this neat little text function duckdb provides: strip_accents(string). It does exactly what it says: Strip accents from a string. Thus Mühleisen becomes Muheisen. This feature saved me from manually defining a map of umlaut characters and replacing them in a bunch of places. SELECT strip_accents(first_name) as first_name_normalized, ....