News
This repository contains a complete pipeline for extracting structured data from Albert Heijn (AH) grocery receipts. It performs PDF OCR, text parsing, and tabular formatting, ultimately producing a ...
With Apache Spark Declarative Pipelines, engineers describe what their pipeline should do using SQL or Python, and Apache Spark handles the execution.
sport-activities-features: A minimalistic toolbox for extracting features from sports activity files written in Python ...