ORC Core · The core reader and writer for ORC files. Uses the vectorized column batch for the in-memory representation.
Group: org.apache.orc - All Dependencies
ORC MapReduce · An implementation of Hadoop's mapred and mapreduce input and output formats for ORC files. They use the core reader and writer, but present the data to the user in Writable objects.
ORC Shims · A shim layer for supporting various versions of Hadoop dynamically. This module uses a newer version of Hadoop so that the shims can use new Hadoop features without a hard dependency on the latest version.
Apache ORC · ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, with integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values that are required for the current query.
ORC Benchmarks · Benchmarks for comparing ORC, Parquet, and Avro performance.
Apache ORC Format · ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, with integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values that are required for the current query.
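
The columnar idea in the descriptions above can be illustrated with a small toy sketch. It is not the ORC file format and does not use the ORC library: the helper names `write_columnar` and `read_columns`, the JSON encoding, and the sample rows are all assumptions made for illustration. It only shows why compressing each column separately lets a reader decompress just the columns a query needs.

```python
import json
import zlib


def write_columnar(rows, columns):
    """Compress each column independently; returns {column name: compressed bytes}.

    Hypothetical helper for illustration only, not part of any ORC API.
    """
    chunks = {}
    for name in columns:
        values = [row[name] for row in rows]
        chunks[name] = zlib.compress(json.dumps(values).encode("utf-8"))
    return chunks


def read_columns(chunks, wanted):
    """Decompress only the requested columns; the other chunks are never touched."""
    return {
        name: json.loads(zlib.decompress(chunks[name]).decode("utf-8"))
        for name in wanted
    }


# Made-up sample data.
rows = [
    {"id": 1, "name": "alice", "score": 9.5},
    {"id": 2, "name": "bob", "score": 7.0},
    {"id": 3, "name": "carol", "score": 8.25},
]

chunks = write_columnar(rows, ["id", "name", "score"])

# A query that needs only "score" reads and decompresses one chunk out of three.
result = read_columns(chunks, ["score"])
print(result)
```

A row-oriented layout would have to decode every field of every row to answer the same query. Here the `id` and `name` chunks stay compressed and unread.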