git: 89d55115a4c0 - main - converters/py-markitdown: New port
Date: Fri, 20 Dec 2024 02:09:50 UTC
The branch main has been updated by wen:
URL: https://cgit.FreeBSD.org/ports/commit/?id=89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b
commit 89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b
Author: Wen Heping <wen@FreeBSD.org>
AuthorDate: 2024-12-20 02:01:25 +0000
Commit: Wen Heping <wen@FreeBSD.org>
CommitDate: 2024-12-20 02:09:15 +0000
converters/py-markitdown: New port
MarkItDown is a utility library for converting various files to Markdown
(e.g., for indexing or text analysis).
It presently supports:
* PDF (.pdf)
* PowerPoint (.pptx)
* Word (.docx)
* Excel (.xlsx)
* Images (EXIF metadata and OCR)
* Audio (EXIF metadata and speech transcription)
* HTML (special handling of Wikipedia, etc.)
* Various other text-based formats (csv, json, xml, etc.)
* ZIP (iterates over contents and converts each file)
---
converters/Makefile | 1 +
converters/py-markitdown/Makefile | 27 +++++++++++++++++++++++++++
converters/py-markitdown/distinfo | 3 +++
converters/py-markitdown/pkg-descr | 13 +++++++++++++
4 files changed, 44 insertions(+)
diff --git a/converters/Makefile b/converters/Makefile
index d963b78583d0..645b3b83065f 100644
--- a/converters/Makefile
+++ b/converters/Makefile
@@ -153,6 +153,7 @@
SUBDIR += py-bsdconv
SUBDIR += py-gotenberg-client
SUBDIR += py-mammoth
+ SUBDIR += py-markitdown
SUBDIR += py-rencode
SUBDIR += py-svglib
SUBDIR += py-text-unidecode
diff --git a/converters/py-markitdown/Makefile b/converters/py-markitdown/Makefile
new file mode 100644
index 000000000000..a9ce7a689d57
--- /dev/null
+++ b/converters/py-markitdown/Makefile
@@ -0,0 +1,27 @@
+PORTNAME= markitdown
+DISTVERSION= 0.0.1a3
+CATEGORIES= converters python
+MASTER_SITES= PYPI
+PKGNAMEPREFIX= ${PYTHON_PKGNAMEPREFIX}
+
+MAINTAINER= wen@FreeBSD.org
+COMMENT= Utility tool for converting various files to Markdown
+WWW= https://pypi.org/project/markitdown/
+
+LICENSE= APACHE20
+
+BUILD_DEPENDS= ${PYTHON_PKGNAMEPREFIX}hatchling>=0:devel/py-hatchling@${PY_FLAVOR}
+RUN_DEPENDS= ${PYTHON_PKGNAMEPREFIX}mammoth>=0:converters/py-mammoth@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}markdownify>=0:textproc/py-markdownify@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}pandas>=0:math/py-pandas@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}pdfminer.six>=0:textproc/py-pdfminer.six@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}python-pptx>=0:textproc/py-python-pptx@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}puremagic>=0:sysutils/py-puremagic@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}requests>=0:www/py-requests@${PY_FLAVOR}
+
+USES= python
+USE_PYTHON= autoplist pep517
+
+NO_ARCH= yes
+
+.include <bsd.port.mk>
diff --git a/converters/py-markitdown/distinfo b/converters/py-markitdown/distinfo
new file mode 100644
index 000000000000..a69065a058ef
--- /dev/null
+++ b/converters/py-markitdown/distinfo
@@ -0,0 +1,3 @@
+TIMESTAMP = 1734654122
+SHA256 (markitdown-0.0.1a3.tar.gz) = f6c8f5f7f5541e91c6c535218318968fefd71e2a6faa0eb782b3492e04cd023d
+SIZE (markitdown-0.0.1a3.tar.gz) = 16073
diff --git a/converters/py-markitdown/pkg-descr b/converters/py-markitdown/pkg-descr
new file mode 100644
index 000000000000..8871cf0e5603
--- /dev/null
+++ b/converters/py-markitdown/pkg-descr
@@ -0,0 +1,13 @@
+MarkItDown is a utility library for converting various files to Markdown
+(e.g., for indexing or text analysis).
+
+It presently supports:
+ * PDF (.pdf)
+ * PowerPoint (.pptx)
+ * Word (.docx)
+ * Excel (.xlsx)
+ * Images (EXIF metadata and OCR)
+ * Audio (EXIF metadata and speech transcription)
+ * HTML (special handling of Wikipedia, etc.)
+ * Various other text-based formats (csv, json, xml, etc.)
+ * ZIP (iterates over contents and converts each file)
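
The distinfo hunk in this commit records a SHA256 digest and a byte size for the distfile. The ports framework checks these itself (for example via `make checksum`), but as a rough illustration of what those two lines mean, here is a minimal Python sketch that parses distinfo text and compares a local file against it. The function names `parse_distinfo` and `verify_distfile` are hypothetical helpers invented for this example; they are not part of the ports tree or of markitdown.

```python
import hashlib
import re
from pathlib import Path

# Patterns for the two per-file line types that appear in distinfo.
_SHA256_RE = re.compile(r"^SHA256 \((?P<name>.+)\) = (?P<digest>[0-9a-f]{64})$")
_SIZE_RE = re.compile(r"^SIZE \((?P<name>.+)\) = (?P<size>\d+)$")


def parse_distinfo(text):
    """Return {filename: {"sha256": str, "size": int}} from distinfo text.

    Hypothetical helper; lines such as TIMESTAMP are ignored.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SHA256_RE.match(line)
        if m:
            entries.setdefault(m["name"], {})["sha256"] = m["digest"]
            continue
        m = _SIZE_RE.match(line)
        if m:
            entries.setdefault(m["name"], {})["size"] = int(m["size"])
    return entries


def verify_distfile(path, expected):
    """Return True if the file at `path` matches the expected size and SHA256."""
    data = Path(path).read_bytes()
    return (
        len(data) == expected["size"]
        and hashlib.sha256(data).hexdigest() == expected["sha256"]
    )


# The distinfo content added by this commit, copied from the diff.
DISTINFO = """\
TIMESTAMP = 1734654122
SHA256 (markitdown-0.0.1a3.tar.gz) = f6c8f5f7f5541e91c6c535218318968fefd71e2a6faa0eb782b3492e04cd023d
SIZE (markitdown-0.0.1a3.tar.gz) = 16073
"""

if __name__ == "__main__":
    name = "markitdown-0.0.1a3.tar.gz"
    expected = parse_distinfo(DISTINFO)[name]
    # Assumes the tarball has already been downloaded to the current directory.
    if Path(name).exists():
        print("match" if verify_distfile(name, expected) else "MISMATCH")
    else:
        print(f"{name} not found; nothing to verify")
```

Checking the size before hashing is only a convenience here; both values must match for the file to be accepted.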