A client sent me a 45-page PDF of quarterly sales data last month—12 tables, merged headers, conditional number formatting per region. "Can you get this into Excel by tomorrow?" What should've been a five-minute job turned into an afternoon of wrestling with misaligned columns, merged cells that exploded into 47 sub-cells, and currency values that became plain text.
So I ran a systematic test. Same 12-table PDF, five free converters, one goal: open in Excel without needing an hour of manual cleanup.
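I created a realistic PDF with the kind of complexity that trips up most converters: 12 tables with merged headers, currency-formatted numbers, and one table with an intentional blank separator row.
I ran each converter twice: once with the default settings and once with any available "preserve formatting" option enabled.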
| Converter | Column Alignment | Merged Cells | Number Formats | Overall |
|---|---|---|---|---|
| Formly | ✓ Perfect | ✓ Preserved | ✓ Retained | Best |
| Smallpdf | ✓ Good | ✗ Lost | Partial | OK |
| Adobe Acrobat | ✓ Perfect | ✗ Split | ✓ Retained | Good |
| iLovePDF | ✗ Misaligned | ✗ Lost | ✗ Plain text | Worst |
| Zamzar | Partial | ✗ Lost | ✗ Plain text | Poor |
Split merged headers were the most common failure mode. A header like "Q1 Revenue" merged across columns B-D becomes three separate cells ("Q1", "Revenue", and an empty cell), each in the wrong position. The iLovePDF output for my 12-table document had 94 extra columns from split merges alone. I had to manually re-merge 38 cell ranges to restore the original structure.
The converters that got this right (Formly, Adobe) both read the PDF's internal table structure rather than relying on visual OCR. If you're testing converters, test for this specifically: run a PDF with at least one merged header and check the Excel output before committing to a tool.
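One way to automate that merged-header check is to pull the header row out of the converted sheet (for example, from a CSV export) and confirm that each expected label still sits intact in a single cell. This is a minimal sketch under that assumption; `find_broken_headers` and the sample rows are hypothetical illustrations, not part of any converter's API.

```python
def find_broken_headers(header_row, expected_labels):
    """Return the expected labels that no longer appear intact in one cell."""
    # Ignore empty cells: a correctly merged header leaves blanks to its right.
    cells = {cell.strip() for cell in header_row if cell and cell.strip()}
    return [label for label in expected_labels if label not in cells]


# Hypothetical header row after a bad conversion: "Q1 Revenue" was split.
converted = ["Region", "Q1", "Revenue", "", "Q2 Revenue", "", ""]
expected = ["Region", "Q1 Revenue", "Q2 Revenue"]

print(find_broken_headers(converted, expected))  # ['Q1 Revenue']
```

An empty result means every merged header survived as a single cell. Anything else tells you which headers to inspect before you trust the rest of the sheet.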
$1,234.56 looks fine in Excel as text until you try to sum the column and get zero. Three of the five converters output some or all numbers as plain text strings. Excel's "Convert to Number" can fix simple cases, but with currency symbols and comma separators, it becomes a manual cell-by-cell operation.
The two converters that preserved number formatting both read the PDF's font encoding metadata to distinguish numeric characters from text. It's not magic—it's metadata that most converters ignore.
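If a converter has already flattened your currency values to text, a short script can do the cleanup that would otherwise be cell-by-cell. The sketch below assumes US-style formatting (a `$` symbol, comma thousands separators, a dot for decimals, and parentheses for negatives); `parse_currency` and the sample column are hypothetical and would need adjusting for other locales.

```python
import re
from decimal import Decimal

# A plain number once symbols and separators are stripped, e.g. "-1234.56".
_NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def parse_currency(text):
    """Convert text like '$1,234.56' or '(45.10)' to Decimal, or None if not numeric."""
    cleaned = text.strip()
    # Accounting style: parentheses mean a negative amount.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    cleaned = cleaned.replace("$", "").replace(",", "").strip()
    if not _NUMERIC.fullmatch(cleaned):
        return None
    value = Decimal(cleaned)
    return -value if negative else value


# Hypothetical column of text cells from a converted sheet.
column = ["$1,234.56", "$987.00", "(45.10)", "n/a"]
values = [parse_currency(cell) for cell in column]
total = sum(v for v in values if v is not None)
print(total)  # 2176.46
```

`Decimal` is used instead of `float` so the cents stay exact, which matters once you start summing financial columns.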
My test PDF had one table with an intentional blank separator row. Two converters treated that blank row as a table boundary and split the table into two, shifting all subsequent columns right by one position. If you're converting PDFs with section breaks or visual separators between data rows, check that the output has the same number of rows as the source.
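The row check is easy to script once the converted table is loaded as a list of rows. This is a rough sketch with hypothetical helper names and sample data. It assumes the first column is always populated in real data rows, so a leading empty cell signals a right shift.

```python
def is_empty(cell):
    """True for None, empty strings, and whitespace-only cells."""
    return cell is None or not str(cell).strip()


def blank_row_indexes(rows):
    """Indexes of rows in which every cell is empty."""
    return [i for i, row in enumerate(rows) if all(is_empty(c) for c in row)]


def shifted_row_indexes(rows):
    """Indexes of non-blank rows whose first cell is empty (likely shifted right)."""
    return [
        i for i, row in enumerate(rows)
        if row and is_empty(row[0]) and not all(is_empty(c) for c in row)
    ]


# Hypothetical converter output: a separator row, then a row shifted by one column.
rows = [
    ["Region", "Q1", "Q2"],
    ["North", "100", "120"],
    ["", "", ""],
    ["", "South", "90", "95"],
]
source_row_count = 4  # assumed: counted by hand from the source PDF table

print(len(rows) == source_row_count)  # True
print(blank_row_indexes(rows))        # [2]
print(shifted_row_indexes(rows))      # [3]
```

A matching row count alone is not enough, as the sample shows: the count lines up, yet the row after the separator has still moved one column to the right.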
For most PDF-to-Excel jobs, I now use Formly's PDF to Excel converter—it's what I built after that 45-page quarterly report disaster. It runs in your browser, handles merged cells and number formats correctly (because it reads the PDF structure, not just the visual layout), and doesn't upload your data anywhere. For sensitive financial documents, that last part matters.
If you're dealing with a scanned PDF (an image of a table rather than text), you'll need something with OCR. Adobe Acrobat Pro handles these best, though it's not free. For text-based PDFs, which cover most business documents, a browser-based converter is faster, free, and produced the cleanest output in my tests.