ABOUT 2 MONTHS AGO • 4 MIN READ

Data Cleaning with AI - 3 methods you should master


AI for Finance

Practical Applications of AI in Finance, Python and machine learning for FP&A

Every month, the same problem. A finance dataset lands in your inbox: multiple tabs, one per region; data starting halfway down the sheet; subtotals scattered between rows; some columns in quarters, others in halves or months.

Before you can do any analysis, you spend hours cleaning it.

This guide shows you three ways to eliminate that work using AI. Whether you need a quick one-time fix, a repeatable monthly process, or a way to keep everything inside Excel, there is a method for you.

All datasets, prompts, and templates referenced in this guide are available here:

Data Cleaning with AI by Christian Martinez.xlsx

A video guide is also available.

What We’re Working With

The synthetic dataset used in this guide is intentionally messy, because that is what real finance files look like. Here is what makes it difficult:

– Multiple tabs, one per region, with no consistent starting row

– Data does not begin at cell A1

– Subtotals appear mid-table without warning

– Some rows use quarterly data, others use halves or months

– Totals missing the yearly figure altogether

– Inconsistent formatting across regions
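A minimal sketch of how the first few problems above might be handled in pandas, assuming a sheet has already been loaded without headers (for example with `pd.read_excel(path, sheet_name=None, header=None)`). The function names, the `min_filled` threshold, and the sample rows are hypothetical and are not taken from the guide's own notebook. The sketch treats the first row with enough filled cells as the header and drops rows whose first cell reads "Total" or "Subtotal".

```python
import pandas as pd


def find_header_row(raw: pd.DataFrame, min_filled: int = 3) -> int:
    """Return the position of the first row with at least `min_filled` non-empty cells."""
    filled = raw.notna().sum(axis=1)
    candidates = filled[filled >= min_filled]
    if candidates.empty:
        raise ValueError("no header-like row found")
    return int(candidates.index[0])


def extract_table(raw: pd.DataFrame) -> pd.DataFrame:
    """Promote the detected header row, then drop blank rows and subtotal rows."""
    header = find_header_row(raw)
    table = raw.iloc[header + 1:].copy()
    table.columns = [str(c).strip() for c in raw.iloc[header]]
    table = table.dropna(how="all")
    # Assumption: subtotal rows are labelled in the first column.
    first_col = table.iloc[:, 0].astype(str).str.strip().str.lower()
    is_subtotal = first_col.str.contains(r"\b(?:sub)?total\b", regex=True)
    return table[~is_subtotal].reset_index(drop=True)


# Hypothetical sheet: a title row, a blank row, then the real table.
raw = pd.DataFrame([
    ["North region report", None, None],
    [None, None, None],
    ["Account", "Q1", "Q2"],
    ["Revenue", 100, 120],
    ["Subtotal", 100, 120],
    ["Costs", 40, 50],
])
clean = extract_table(raw)
print(clean)
```

A title row that happens to fill three or more cells would fool this heuristic, so the threshold is something to tune per file rather than a fixed rule.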

The goal: turn this into a clean, consolidated dataset in which each region's table starts at cell A1, together with a cleaning log so you can audit exactly what changed.
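To make the idea of a cleaning log concrete, here is a small sketch in plain Python. The `LogEntry` fields and the `drop_blank_rows` helper are hypothetical; the log produced by the prompts in this guide may be structured differently. The point is only that every change is recorded alongside the row it affected.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    sheet: str   # tab the row came from
    row: int     # 1-based row number in the original sheet
    action: str  # what was done, e.g. "dropped"
    reason: str  # why it was done


def drop_blank_rows(sheet, rows, first_row=1):
    """Remove fully blank rows and record each removal in a log."""
    kept, log = [], []
    for offset, row in enumerate(rows):
        if all(cell is None or str(cell).strip() == "" for cell in row):
            log.append(LogEntry(sheet, first_row + offset, "dropped", "blank row"))
        else:
            kept.append(row)
    return kept, log


# Hypothetical rows read from the "North" tab, starting at sheet row 5.
rows = [["Revenue", 100], [None, ""], ["Costs", 40]]
kept, log = drop_blank_rows("North", rows, first_row=5)
for entry in log:
    print(entry)
```

Keeping the original row number in each entry lets a reviewer open the raw file and confirm the change.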

Method 1: Data Cleaning with Claude

Best for: one-time cleanups

If this is an ad hoc exercise, something you only need to do once, the fastest path is to use Claude directly in the chat interface.

How it works

1. Open Claude at claude.ai

2. Use the + button to attach your Excel file

3. Paste in the prompt from the link below, then customise it for your dataset

4. Claude returns cleaned data and a cleaning log

5. Click download, and the file goes straight to your downloads folder

The prompt provided works with the sample dataset. For your own file, you will need to describe your specific columns, inconsistencies, and what the clean output should look like.

The prompt gives you the structure; adjust the details.

https://claude.ai/public/artifacts/d92731ad-8069-4de9-b846-8ead914f1d46

When to use this

– You have a one-off file that needs cleaning

– You do not need to repeat the process next month

– Speed matters more than reproducibility

Method 2: Data Cleaning with Claude Artifacts

Best for: repeatable monthly processes

If you need to clean the same type of file every month, building it once as a Claude Artifact makes far more sense.

An Artifact is a mini-app Claude builds for you: one you can reuse, share with your team, and run on any new file without touching the prompt again.

I have a dedicated video on Artifacts here: How to Create Claude Artifacts for Finance and FP&A

How it works

  1. Open a new Claude chat
  2. Paste in the Artifact prompt (different from Method 1) and attach your dataset
  3. Claude builds a small cleaning app in the Artifacts panel
  4. If you see a processing error, copy it and paste it back into the chat so Claude can fix the app
  5. Upload next month’s file into the same app to get clean output in seconds
  6. Share the Artifact link with your team so anyone can run the same cleaning

Prompt template for Method 2: https://claude.ai/public/artifacts/d92731ad-8069-4de9-b846-8ead914f1d46

This is an example of how the app would look:

When to use this

– Same file structure arrives every month

– You want to share the cleaning process with your team

– You want one place to run, audit, and repeat the cleaning

Want more workflows like this?

Join the AI Finance Club, a community for finance and accounting leaders building smarter workflows with AI. Templates, prompts, live sessions, and a network of peers doing the same work.

Also explore our AI Finance Accelerator Program. In 6 weeks you’ll automate core reporting, turn numbers into board-ready narratives, and deploy a forecast pipeline you can defend to your CEO and board.

Method 3: Any AI + Python in Google Colab

Best for: those who prefer to stay outside Claude

If you would rather not use Claude, or if your organisation uses a different AI tool, this method works with any AI that can write Python code. You then run that code in Google Colab, which is free and requires no installation.

How it works

  1. Open any AI tool (ChatGPT, Gemini, Copilot, or Claude)
  2. Describe your dataset and ask for Python code to clean it
  3. Copy the generated code
  4. Open Google Colab and paste the code into a new notebook
  5. Upload your file when prompted and run the cell
  6. Download the cleaned output
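As an illustration of the kind of code an AI tool might return for these steps, here is a hedged sketch that stacks one table per region into a single long table. It assumes each tab has already been reduced to a tidy table with an `Account` column plus period columns; the `consolidate` function, the column names, and the sample figures are hypothetical and are not the contents of the solution notebook. The Colab upload and download calls are shown only as comments because they run only inside Colab.

```python
import pandas as pd

# In Colab the sheets would typically come from an uploaded workbook, e.g.:
#   from google.colab import files
#   uploaded = files.upload()                      # opens a file picker
#   sheets = pd.read_excel(next(iter(uploaded)), sheet_name=None)
# sheet_name=None returns a dict of {tab name: DataFrame}.


def consolidate(sheets: dict, id_col: str = "Account") -> pd.DataFrame:
    """Stack one table per region into a long Region/Account/Period/Value table."""
    frames = []
    for region, table in sheets.items():
        long = table.melt(id_vars=id_col, var_name="Period", value_name="Value")
        long.insert(0, "Region", region)
        frames.append(long)
    combined = pd.concat(frames, ignore_index=True)
    combined["Value"] = pd.to_numeric(combined["Value"], errors="coerce")
    # Rows whose value could not be read as a number are dropped here.
    return combined.dropna(subset=["Value"]).reset_index(drop=True)


# Hypothetical tabs: one reports quarters, the other a half-year figure.
sheets = {
    "North": pd.DataFrame({"Account": ["Revenue"], "Q1": [100], "Q2": [120]}),
    "South": pd.DataFrame({"Account": ["Revenue"], "H1": [210]}),
}
combined = consolidate(sheets)
print(combined)

# To download the result from Colab:
#   combined.to_csv("clean.csv", index=False)
#   files.download("clean.csv")
```

The long format keeps quarters, halves, and months in one `Period` column, so regions that report at different granularities can sit in the same table without forcing them into a shared set of columns.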

Solution Colab notebook: https://colab.research.google.com/drive/1ip0-gJsay7IJb7KFcP2SNFzkI_wyJjlQ?usp=sharing

Prompt: https://claude.ai/public/artifacts/d92731ad-8069-4de9-b846-8ead914f1d46

When to use this

– Your organisation restricts access to Claude specifically

– You are already comfortable with a different AI tool

– You want full control over the Python logic

Bonus: Run It Directly in Excel with =PY

No browser. If you have Python in Excel enabled, you can run the entire cleaning script directly inside your Excel file.

You can learn more about Python in Excel including how to download here.

How it works

  1. Open your Excel file with the raw data
  2. Create a new tab and type =PY in any cell
  3. Ask any AI tool to write you Python code for =PY in Excel, describing your sheet
  4. Copy and paste the code into the Python cell
  5. Press Ctrl + Enter to commit
  6. Right-click the result → Python Output as Excel Value, and your clean data spills into the sheet
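For a sense of what might go into the Python cell, here is a small sketch that tidies column names. Inside Excel, data is read with the built-in `xl()` function; the sheet name and range in the comment are hypothetical, and the `standardise_columns` helper is an illustration rather than the script from this guide. The sample DataFrame stands in for the `xl()` result so the code also runs outside Excel.

```python
import re

import pandas as pd


def standardise_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Lower-case column names and replace runs of other characters with '_'."""
    def tidy(name) -> str:
        return re.sub(r"[^0-9a-z]+", "_", str(name).strip().lower()).strip("_")
    return df.rename(columns=tidy)


# Inside a =PY cell, the input would come from the workbook, e.g. (assumed range):
#   df = xl("Raw!A1:H60", headers=True)
# Outside Excel, a hypothetical stand-in:
df = pd.DataFrame({" Account Name ": ["Revenue"], "Q1 (USD)": [100], "FY-Total": [400]})

result = standardise_columns(df)
print(list(result.columns))
```

In a Python cell the last expression is what gets returned to the grid, so ending the cell with `standardise_columns(df)` would return the cleaned table.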

The script reads your raw sheet, strips metadata rows, standardises every column name, handles nulls and inconsistent values, recalculates FY totals, and flags anything ambiguous, with a full data_flag column so you have an audit trail.
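The FY recalculation and the `data_flag` column could look roughly like the sketch below. The column names `q1` to `q4` and `fy`, the flag values, and the tolerance are all assumptions made for illustration; the actual script may use different names and rules.

```python
import pandas as pd

QUARTERS = ["q1", "q2", "q3", "q4"]  # hypothetical column names


def recalc_fy(df: pd.DataFrame, tol: float = 0.005) -> pd.DataFrame:
    """Recompute FY totals from quarters and record what happened in data_flag."""
    out = df.copy()
    # min_count makes the sum NaN when any quarter is missing.
    calc = out[QUARTERS].sum(axis=1, min_count=len(QUARTERS))
    out["data_flag"] = "ok"
    out.loc[(out["fy"] - calc).abs() > tol, "data_flag"] = "fy_mismatch"
    out.loc[out["fy"].isna() & calc.notna(), "data_flag"] = "fy_recalculated"
    out.loc[calc.isna(), "data_flag"] = "incomplete_quarters"
    # Use the recalculated total where available, else keep the reported one.
    out["fy"] = calc.fillna(out["fy"])
    return out


# Hypothetical rows covering each case.
df = pd.DataFrame({
    "account": ["A", "B", "C", "D"],
    "q1": [10, 1, 5, 5],
    "q2": [20, 2, 5, None],
    "q3": [30, 3, 5, 5],
    "q4": [40, 4, 5, 5],
    "fy": [100, None, 25, 15],
})
checked = recalc_fy(df)
print(checked[["account", "fy", "data_flag"]])
```

Overwriting a mismatched FY figure is a design choice in this sketch; because the row is flagged, a reviewer can still trace every value that was changed.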

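Which Method Should You Use?

– Method 1 (Claude chat): one-time cleanups

– Method 2 (Claude Artifacts): repeatable monthly processes

– Method 3 (any AI + Python in Google Colab): those who prefer to stay outside Claude

– Bonus (=PY): keeping everything inside Excel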

Final Thought

Data cleaning is not analysis. Every hour spent reformatting rows is an hour not spent interpreting what the numbers mean. These three methods, and the bonus, give you options at every level of complexity and tool preference.

Start with the one that fits your situation today. Build from there.

Hope this helps!

Christian Martinez
