ParquetWriterProperties class
This class holds settings that control how a Parquet file is written by ParquetFileWriter.
The parameters compression, compression_level, use_dictionary and write_statistics support various patterns:
NULL leaves the parameter unspecified, and the C++ library uses an appropriate default for each column (defaults are listed with the arguments below).
A single value (for example, one string for compression) applies to all columns.

Unlike the high-level write_parquet, ParquetWriterProperties arguments use the C++ defaults. Currently this means "uncompressed" rather than "snappy" for the compression argument.
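The two value patterns can be sketched with the high-level write_parquet, which takes parameters of the same names. This is a minimal illustration, not a reference usage: the data frame, the temporary file and the chosen values are all hypothetical, and it assumes the arrow package is installed.

```r
library(arrow)

# Hypothetical example data and output path.
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
tf <- tempfile(fileext = ".parquet")

write_parquet(
  df, tf,
  compression = "uncompressed",  # single value: applies to all columns
  use_dictionary = NULL,         # NULL: left unspecified, library default is used
  write_statistics = FALSE       # single value: applies to all columns
)

# Read the file back to confirm that the data round-trips.
back <- as.data.frame(read_parquet(tf))
```

Passing "uncompressed" explicitly here matches the C++ default that ParquetWriterProperties would use on its own.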
The ParquetWriterProperties$create() factory method instantiates the object and takes the following arguments:
table: table to write (required)
version: Parquet version, "1.0" or "2.0". Default "1.0"
compression: Compression algorithm. Default "uncompressed"
compression_level: Compression level; meaning depends on compression algorithm
use_dictionary: Specify if we should use dictionary encoding. Default TRUE
write_statistics: Specify if we should write statistics. Default TRUE
data_page_size: Set a target threshold for the approximate encoded size of data pages within a column chunk (in bytes). Default 1 MiB.

write_parquet
Schema for information about schemas and metadata handling.
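A sketch of calling the factory with the arguments listed above might look like the following. Several things here are assumptions rather than documented behavior: the example table and file path are hypothetical, the first argument may be named differently in some releases of the package (so the sketch inspects the installed signature instead of hard-coding it), and write_parquet is assumed to accept a properties argument.

```r
library(arrow)

# Hypothetical example table.
tab <- Table$create(data.frame(x = 1:3, y = c("a", "b", "c")))

# Assumption: this page documents the first argument as `table`, but an
# installed release may expect column names there instead. Check the
# signature and pass whichever form it asks for.
first_arg <- names(formals(ParquetWriterProperties$create))[1]
target <- if (identical(first_arg, "table")) tab else names(tab)

props <- ParquetWriterProperties$create(
  target,
  version = "1.0",
  compression = "uncompressed",
  use_dictionary = TRUE,
  write_statistics = TRUE,
  data_page_size = 1024 * 1024  # 1 MiB, expressed in bytes
)

# Assumption: write_parquet() takes the prepared object via `properties`.
tf <- tempfile(fileext = ".parquet")
write_parquet(tab, tf, properties = props)
```

Every value is spelled out here to mirror the documented defaults; in practice only the settings that differ from the defaults would need to be passed.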