How to Work with String Data in Data.olllo AI Chat

With Data.olllo AI Chat, you can clean, standardize, and extract key information from messy text data without memorizing complex syntax or regex. Describe what you need, and the AI generates a ready-to-run process(dfs) function for you. In the examples below, each function reads the table from dfs["df"] and returns the modified table.

Screenshot: Enter your requirement, and Data.olllo writes and runs the code to clean and extract text data.

1. Convert Text to Uppercase

Standardize text by making all letters uppercase:

def process(dfs):
    df = dfs["df"]
    df["customer_name"] = df["customer_name"].str.upper()
    return df
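
To try a generated function outside the app, a small harness can help. The sketch below assumes that dfs is a plain dictionary mapping the name "df" to a pandas DataFrame, which is what the examples in this guide suggest; the sample data is made up for illustration.

```python
import pandas as pd

def process(dfs):
    df = dfs["df"]
    df["customer_name"] = df["customer_name"].str.upper()
    return df

# Hypothetical sample data; the missing value stays missing after .str.upper()
sample = pd.DataFrame({"customer_name": ["alice smith", "Bob Jones", None]})
result = process({"df": sample})
print(result["customer_name"].tolist())
```

Note that process modifies the DataFrame it receives. If you want to keep the original untouched while experimenting, pass in sample.copy() instead.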

2. Extract Mobile Numbers from Text

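Pull phone-like digit sequences from unstructured text using regex. The pattern checks shape only (an optional +, then at least nine digits, spaces, or hyphens, starting and ending with a digit), so it does not validate numbers, and only the first match in each row is kept:

def process(dfs):
    df = dfs["df"]
    # expand=False returns a Series, so the result assigns cleanly to one column
    df["mobile"] = df["notes"].str.extract(r'(\+?\d[\d\s-]{7,}\d)', expand=False)
    return df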
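
When a row may hold more than one number, or when you want a uniform format, the same pattern can be reused with other pandas string methods. This is a minimal sketch with invented sample notes; the names PHONE_PATTERN, first, all_matches, and normalized are illustrative only.

```python
import pandas as pd

PHONE_PATTERN = r'\+?\d[\d\s-]{7,}\d'

notes = pd.Series([
    "Call +1 415-555-0132 after 5pm",
    "Office 020 7946 0958, mobile 07700 900123",
    "No number on file",
])

# First match per row, returned as a Series
first = notes.str.extract(f"({PHONE_PATTERN})", expand=False)

# Every match per row, returned as a list (empty when nothing matches)
all_matches = notes.str.findall(PHONE_PATTERN)

# Normalize the first match by dropping spaces and hyphens
normalized = first.str.replace(r"[\s-]", "", regex=True)

print(first.tolist())
print(all_matches.tolist())
print(normalized.tolist())
```

Because the pattern accepts any long run of digits, spaces, and hyphens, it will also match things like order IDs or dates written with hyphens, so a quick look at the results is worthwhile.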

3. Extract Email Addresses

Isolate emails from messy contact info:

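def process(dfs):
    df = dfs["df"]
    # Require a dotted domain and stop before trailing punctuation such as "."
    # expand=False returns a Series holding the first match per row
    df["email"] = df["contact_info"].str.extract(r'([\w.+-]+@[\w-]+(?:\.[\w-]+)+)', expand=False)
    return df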
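
A simple pattern such as [\w\.-]+@[\w\.-]+ also captures punctuation that follows an address at the end of a sentence. The sketch below compares it with a stricter pattern that requires a dotted domain; both are rough heuristics rather than full email validation, and the sample strings are invented.

```python
import pandas as pd

contact_info = pd.Series([
    "Reach me at jane.doe@example.com.",
    "sales@shop.co.uk or phone",
    "no email here",
])

# Loose pattern: the domain part may swallow a trailing period
loose = contact_info.str.extract(r'([\w\.-]+@[\w\.-]+)', expand=False)

# Stricter pattern: each dot in the domain must be followed by more characters
strict = contact_info.str.extract(r'([\w.+-]+@[\w-]+(?:\.[\w-]+)+)', expand=False)

print(loose.tolist())
print(strict.tolist())
```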

4. Remove Special Characters

Keep only ASCII letters, digits, and spaces (accented letters such as é are removed too):

def process(dfs):
    df = dfs["df"]
    df["clean_text"] = df["raw_text"].str.replace(r'[^A-Za-z0-9 ]+', '', regex=True)
    return df
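
If your text contains accented or non-Latin letters that should survive cleaning, a Unicode-aware character class is one alternative. This sketch assumes you want to drop punctuation while keeping letters in any script; note that \w also keeps underscores. The sample strings are made up.

```python
import pandas as pd

raw_text = pd.Series(["Café #1: très bien!!", "  Hello,   World! "])

# ASCII-only version: accented letters are removed along with punctuation
ascii_only = raw_text.str.replace(r'[^A-Za-z0-9 ]+', '', regex=True)

# Unicode-aware version: \w matches letters and digits in any script
unicode_safe = raw_text.str.replace(r'[^\w\s]+', '', regex=True)

# Optional tidy-up: collapse repeated whitespace and trim the ends
tidy = unicode_safe.str.replace(r'\s+', ' ', regex=True).str.strip()

print(ascii_only.tolist())
print(tidy.tolist())
```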

5. Split and Combine Strings

Split full names into first and last names, or merge columns:

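def process(dfs):
    df = dfs["df"]
    # Split on the first space only; reindex guarantees both columns exist
    # even when no name in the column contains a space
    name_split = df["full_name"].str.split(" ", n=1, expand=True).reindex(columns=[0, 1])
    df["first_name"] = name_split[0]
    # Single-word names have no last name; use an empty string instead of a missing value
    df["last_name"] = name_split[1].fillna("")
    # Recombine, trimming the trailing space left by single-word names
    df["full_name_merged"] = (df["first_name"] + " " + df["last_name"]).str.strip()
    return df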
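
When merging columns, a missing value in either one turns the whole result into a missing value if you join with +. One way around this is pandas' str.cat with na_rep, sketched below on invented data; the variable names are illustrative.

```python
import pandas as pd

first = pd.Series(["Ada", "Cher", None])
last = pd.Series(["Lovelace", None, "Turing"])

# Joining with + propagates missing values to the result
plus_joined = first + " " + last

# str.cat substitutes na_rep for missing parts; strip removes the stray space
cat_joined = first.str.cat(last, sep=" ", na_rep="").str.strip()

print(plus_joined.isna().tolist())
print(cat_joined.tolist())
```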

💡 Pro Tips

  • You can combine multiple cleaning steps into a single AI prompt.
  • Always preview extracted data, because regex patterns can capture unintended matches or miss valid ones.
  • For complex extractions, include a few sample inputs and the output you expect in your prompt for better accuracy.
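
As an illustration of the first two tips, the sketch below combines two cleaning steps in one process(dfs) function and then previews the rows where extraction found nothing. It assumes dfs is a plain dictionary holding a pandas DataFrame under "df"; the sample data and the unmatched variable are hypothetical.

```python
import pandas as pd

def process(dfs):
    df = dfs["df"]
    # Step 1: trim stray whitespace and standardize case
    df["customer_name"] = df["customer_name"].str.strip().str.upper()
    # Step 2: extract the first email-like string from each row
    df["email"] = df["contact_info"].str.extract(
        r'([\w.+-]+@[\w-]+(?:\.[\w-]+)+)', expand=False
    )
    return df

sample = pd.DataFrame({
    "customer_name": [" alice smith", "bob jones "],
    "contact_info": ["alice@example.com (work)", "call the front desk"],
})
result = process({"df": sample})

# Preview: rows with no match deserve a manual look
unmatched = result[result["email"].isna()]
print(result.head())
print(f"{len(unmatched)} of {len(result)} rows had no email match")
```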