<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>OCR on recca0120 Tech Notes</title><link>https://recca0120.github.io/tags/ocr/</link><description>Recent content in OCR on recca0120 Tech Notes</description><generator>Hugo -- gohugo.io</generator><language>zh-hant-tw</language><lastBuildDate>Fri, 24 Apr 2026 21:00:00 +0800</lastBuildDate><atom:link href="https://recca0120.github.io/tags/ocr/index.xml" rel="self" type="application/rss+xml"/><item><title>MinerU Hands-On: Turning PDF Papers into Markdown That RAG Can Digest</title><link>https://recca0120.github.io/2026/04/24/mineru-pdf-to-markdown/</link><pubDate>Fri, 24 Apr 2026 21:00:00 +0800</pubDate><guid>https://recca0120.github.io/2026/04/24/mineru-pdf-to-markdown/</guid><description>The biggest pain in feeding PDFs to an LLM is that formulas, tables, and two-column layouts get mangled. I used MinerU 2.5 to convert a multi-column academic PDF to Markdown: formulas became LaTeX, tables became HTML, the reading order came out right, and it all ran in CPU mode.</description><content:encoded><![CDATA[The biggest pain in feeding PDFs to an LLM is that formulas, tables, and two-column layouts get mangled. I used MinerU 2.5 to convert a multi-column academic PDF to Markdown: formulas became LaTeX, tables became HTML, the reading order came out right, and it all ran in CPU mode.<br/><img src="https://recca0120.github.io/2026/04/24/mineru-pdf-to-markdown/featured.png" alt="Featured image"/>]]></content:encoded><enclosure url="https://recca0120.github.io/2026/04/24/mineru-pdf-to-markdown/featured.png" type="image/png" length="0"/><media:content url="https://recca0120.github.io/2026/04/24/mineru-pdf-to-markdown/featured.png" medium="image"/><category>MinerU</category><category>PDF</category><category>RAG</category><category>OCR</category><category>LLM</category><category>AI</category></item></channel></rss>