package dynparquet

Import Path
	github.com/polarsignals/frostdb/dynparquet (on go.dev)

Dependency Relation
	imports 25 packages, and imported by 7 packages


Package-Level Type Names (total 21)
Buffer represents a batch of rows with a concrete set of dynamic column names representing how its parquet schema was created off of a dynamic parquet schema.
	(*Buffer) Clone() (*Buffer, error)
	ColumnChunks returns the list of parquet.ColumnChunk for the given index. It contains all the pages associated with this row group's column. Implements the parquet.RowGroup interface.
	DynamicColumns returns the concrete dynamic column names of the buffer. It implements the DynamicRowGroup interface.
	DynamicRows returns an iterator for the rows in the buffer. It implements the DynamicRowGroup interface.
	NumRows returns the number of rows in the buffer. Implements the parquet.RowGroup interface.
	(*Buffer) Reset()
	Rows returns an iterator for the rows in the buffer. It implements the parquet.RowGroup interface.
	Schema returns the concrete parquet.Schema of the buffer. Implements the parquet.RowGroup interface.
	(*Buffer) Size() int64
	(*Buffer) Sort()
	SortingColumns returns the concrete slice of parquet.SortingColumns of the buffer. Implements the parquet.RowGroup interface.
	(*Buffer) String() string
	WriteRowGroup writes a row group to the buffer.
	WriteRows writes the given rows to the buffer.
	*Buffer : DynamicRowGroup
	*Buffer : github.com/polarsignals/frostdb/query/expr.Particulate
	*Buffer : github.com/parquet-go/parquet-go.RowGroup
	*Buffer : github.com/parquet-go/parquet-go.RowGroupWriter
	*Buffer : github.com/parquet-go/parquet-go.RowWriter
	*Buffer : github.com/parquet-go/parquet-go.RowWriterWithSchema
	*Buffer : expvar.Var
	*Buffer : fmt.Stringer
	func ToBuffer(s Samples, schema *Schema) (*Buffer, error)
	func (*Buffer).Clone() (*Buffer, error)
	func (*Schema).NewBuffer(dynamicColumns map[string][]string) (*Buffer, error)
	func (*Schema).NewBufferV2(dynamicColumns ...*schemav2pb.Node) (*Buffer, error)
	func (*Schema).SerializeBuffer(w io.Writer, buffer *Buffer) error
ColumnDefinition describes a column in a dynamic parquet schema.
	Dynamic bool
	Name string
	PreHash bool
	StorageLayout parquet.Node
	func (*Schema).ColumnByName(name string) (ColumnDefinition, bool)
	func (*Schema).ColumnDefinitionsForSortingColumns() []ColumnDefinition
	func (*Schema).Columns() []ColumnDefinition
	func (*Schema).FindColumn(column string) (ColumnDefinition, bool)
	func (*Schema).FindDynamicColumn(dynamicColumnName string) (ColumnDefinition, bool)
	func (*Schema).FindDynamicColumnForConcreteColumn(column string) (ColumnDefinition, bool)
DynamicRow
	DynamicColumns map[string][]string
	Row parquet.Row
	Schema *parquet.Schema
	func NewDynamicRow(row parquet.Row, schema *parquet.Schema, dyncols map[string][]string, fields []parquet.Field) *DynamicRow
	func (*DynamicRows).Get(i int) *DynamicRow
	func (*DynamicRows).GetCopy(i int) *DynamicRow
	func github.com/polarsignals/frostdb/parts.Part.Least() (*DynamicRow, error)
	func github.com/polarsignals/frostdb/parts.Part.Most() (*DynamicRow, error)
	func github.com/polarsignals/frostdb/pqarrow.RecordToDynamicRow(pqSchema *parquet.Schema, record arrow.Record, dyncols map[string][]string, index int) (*DynamicRow, error)
	func (*Schema).Cmp(a, b *DynamicRow) int
	func (*Schema).RowLessThan(a, b *DynamicRow) bool
DynamicRowGroup is a parquet.RowGroup that can describe the concrete dynamic columns.
	Returns the list of column chunks in this row group. The chunks are ordered in the order of leaf columns from the row group's schema. If the underlying implementation is not read-only, the returned parquet.ColumnChunk may implement other interfaces: for example, parquet.ColumnBuffer if the chunk is backed by an in-memory buffer, or typed writer interfaces like parquet.Int32Writer depending on the underlying type of values that can be written to the chunk. As an optimization, the row group may return the same slice across multiple calls to this method. Applications should treat the returned slice as read-only.
	DynamicColumns returns the concrete dynamic column names that were used to create its concrete parquet schema with a dynamic parquet schema.
	DynamicRows returns an iterator over the rows in the row group.
	Returns the number of rows in the group.
	Returns a reader exposing the rows of the row group. As an optimization, the returned parquet.Rows object may implement parquet.RowWriterTo, and test the RowWriter it receives for an implementation of the parquet.RowGroupWriter interface. This optimization mechanism is leveraged by the parquet.CopyRows function to skip the generic row-by-row copy algorithm and delegate the copy logic to the parquet.Rows object.
	Returns the schema of rows in the group.
	Returns the list of sorting columns describing how rows are sorted in the group. The method will return an empty slice if the rows are not sorted.
	( DynamicRowGroup) String() string
	*Buffer
	*MergedRowGroup
	PooledBuffer
	github.com/polarsignals/frostdb/index.ReleaseableRowGroup (interface)
	DynamicRowGroup : github.com/polarsignals/frostdb/query/expr.Particulate
	DynamicRowGroup : github.com/parquet-go/parquet-go.RowGroup
	DynamicRowGroup : expvar.Var
	DynamicRowGroup : fmt.Stringer
	func Concat(fields []parquet.Field, drg ...DynamicRowGroup) DynamicRowGroup
	func (*Schema).MergeDynamicRowGroups(rowGroups []DynamicRowGroup, options ...MergeOption) (DynamicRowGroup, error)
	func (*SerializedBuffer).DynamicRowGroup(i int) DynamicRowGroup
	func (*SerializedBuffer).MultiDynamicRowGroup() DynamicRowGroup
DynamicRowGroupMergeAdapter maps a RowBatch with a Schema with a subset of dynamic columns to a Schema with a superset of dynamic columns. It implements the parquet.RowGroup interface.
	Returns the leaf column at the given index in the group. Searches for the same column in the original batch. If not found returns a column chunk filled with nulls.
	Returns the number of rows in the group.
	Returns a reader exposing the rows of the row group.
	Returns the schema of rows in the group. The schema is the configured merged, superset schema.
	Returns the list of sorting columns describing how rows are sorted in the group. The method will return an empty slice if the rows are not sorted.
	*DynamicRowGroupMergeAdapter : github.com/polarsignals/frostdb/query/expr.Particulate
	*DynamicRowGroupMergeAdapter : github.com/parquet-go/parquet-go.RowGroup
	func NewDynamicRowGroupMergeAdapter(schema *parquet.Schema, sortingColumns []parquet.SortingColumn, mergedDynamicColumns map[string][]string, originalRowGroup parquet.RowGroup) *DynamicRowGroupMergeAdapter
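The subset-to-superset mapping described for DynamicRowGroupMergeAdapter can be pictured with a small standard-library sketch. This is not the package's implementation: the `column` type and the `adapt` function are hypothetical, and slices of string pointers stand in for parquet column chunks. It only shows the idea of looking each superset column up in the original batch and substituting an all-null column when it is missing.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a parquet column chunk: a name plus
// one optional value per row (nil means null).
type column struct {
	name   string
	values []*string
}

// adapt maps the columns of an original batch onto a superset list of column
// names. Columns missing from the original are filled with nulls.
func adapt(superset []string, original []column, numRows int) []column {
	byName := make(map[string]column, len(original))
	for _, c := range original {
		byName[c.name] = c
	}
	out := make([]column, 0, len(superset))
	for _, name := range superset {
		if c, ok := byName[name]; ok {
			out = append(out, c)
			continue
		}
		// make() leaves every pointer nil, so the whole column is null.
		out = append(out, column{name: name, values: make([]*string, numRows)})
	}
	return out
}

func main() {
	v := "a"
	original := []column{{name: "labels.label1", values: []*string{&v}}}
	merged := adapt([]string{"labels.label1", "labels.label2"}, original, 1)
	for _, c := range merged {
		fmt.Println(c.name, "has value:", c.values[0] != nil)
	}
}
```

As in the adapter's documentation, the sketch assumes the superset really contains every original column; it does not check that property.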
DynamicRowReader is an iterator over the rows in a DynamicRowGroup.
	( DynamicRowReader) Close() error
	( DynamicRowReader) ReadRows(*DynamicRows) (int, error)
	Positions the stream on the given row index. Some implementations of the interface may only allow seeking forward. The method returns io.ErrClosedPipe if the stream had already been closed.
	DynamicRowReader : github.com/parquet-go/parquet-go.RowSeeker
	DynamicRowReader : github.com/prometheus/common/expfmt.Closer
	DynamicRowReader : io.Closer
	func (*Buffer).DynamicRows() DynamicRowReader
	func DynamicRowGroup.DynamicRows() DynamicRowReader
	func (*MergedRowGroup).DynamicRows() DynamicRowReader
	func (*SerializedBuffer).DynamicRows() DynamicRowReader
	func github.com/polarsignals/frostdb/index.ReleaseableRowGroup.DynamicRows() DynamicRowReader
DynamicRows
	DynamicColumns map[string][]string
	Rows []parquet.Row
	Schema *parquet.Schema
	(*DynamicRows) Get(i int) *DynamicRow
	(*DynamicRows) GetCopy(i int) *DynamicRow
	(*DynamicRows) IsSorted(schema *Schema) bool
	func NewDynamicRows(rows []parquet.Row, schema *parquet.Schema, dynamicColumns map[string][]string, fields []parquet.Field) *DynamicRows
	func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
	func DynamicRowReader.ReadRows(*DynamicRows) (int, error)
DynamicRowSorter
	(*DynamicRowSorter) Len() int
	(*DynamicRowSorter) Less(i, j int) bool
	(*DynamicRowSorter) Swap(i, j int)
	*DynamicRowSorter : sort.Interface
	func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
MergedRowGroup allows wrapping any parquet.RowGroup to implement the DynamicRowGroup interface by specifying the concrete dynamic column names the RowGroup's schema contains.
	DynCols map[string][]string
	RowGroup parquet.RowGroup
	Returns the list of column chunks in this row group. The chunks are ordered in the order of leaf columns from the row group's schema. If the underlying implementation is not read-only, the returned parquet.ColumnChunk may implement other interfaces: for example, parquet.ColumnBuffer if the chunk is backed by an in-memory buffer, or typed writer interfaces like parquet.Int32Writer depending on the underlying type of values that can be written to the chunk. As an optimization, the row group may return the same slice across multiple calls to this method. Applications should treat the returned slice as read-only.
	DynamicColumns returns the concrete dynamic column names that were used to create its concrete parquet schema with a dynamic parquet schema. Implements the DynamicRowGroup interface.
	DynamicRows returns an iterator over the rows in the row group. Implements the DynamicRowGroup interface.
	Returns the number of rows in the group.
	Returns a reader exposing the rows of the row group. As an optimization, the returned parquet.Rows object may implement parquet.RowWriterTo, and test the RowWriter it receives for an implementation of the parquet.RowGroupWriter interface. This optimization mechanism is leveraged by the parquet.CopyRows function to skip the generic row-by-row copy algorithm and delegate the copy logic to the parquet.Rows object.
	Returns the schema of rows in the group.
	Returns the list of sorting columns describing how rows are sorted in the group. The method will return an empty slice if the rows are not sorted.
	(*MergedRowGroup) String() string
	*MergedRowGroup : DynamicRowGroup
	MergedRowGroup : github.com/polarsignals/frostdb/query/expr.Particulate
	MergedRowGroup : github.com/parquet-go/parquet-go.RowGroup
	*MergedRowGroup : expvar.Var
	*MergedRowGroup : fmt.Stringer
MergeOption
	func WithAlreadySorted() MergeOption
	func WithDynamicCols(cols map[string][]string) MergeOption
	func (*Schema).MergeDynamicRowGroups(rowGroups []DynamicRowGroup, options ...MergeOption) (DynamicRowGroup, error)
NilColumnChunk is a column chunk that contains a single page with all null values of the given type, given length and column index of the parent schema. It implements the parquet.ColumnChunk interface.
	BloomFilter returns the bloomfilter of the column chunk. Since the NilColumnChunk is a virtual column chunk only for in-memory purposes, it returns nil. Implements the parquet.ColumnChunk interface.
	Column returns the index of the column chunk within the parent schema. Implements the parquet.ColumnChunk interface.
	ColumnIndex returns the column index of the column chunk. Since the NilColumnChunk is a virtual column chunk only for in-memory purposes, it returns nil. Implements the parquet.ColumnChunk interface.
	NumValues returns the number of values in the column chunk. Implements the parquet.ColumnChunk interface.
	OffsetIndex returns the offset index of the column chunk. Since the NilColumnChunk is a virtual column chunk only for in-memory purposes, it returns nil. Implements the parquet.ColumnChunk interface.
	Pages returns an iterator for all pages within the column chunk. This iterator will only ever return a single page filled with all null values of the configured amount. Implements the parquet.ColumnChunk interface.
	Type returns the type of the column chunk. Implements the parquet.ColumnChunk interface.
	*NilColumnChunk : github.com/parquet-go/parquet-go.ColumnChunk
	func NewNilColumnChunk(typ parquet.Type, columnIndex, numValues int) *NilColumnChunk
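The "virtual chunk" idea behind NilColumnChunk (keep only a column index and a value count, and produce a single all-null page on demand) can be sketched with the standard library. The `nilChunk` type and its `pages` method are hypothetical and do not use the parquet-go page types; they only illustrate why such a chunk needs no stored values.

```go
package main

import "fmt"

// nilChunk is a hypothetical stand-in for a virtual column chunk: it stores
// only a column index and a value count, never any values.
type nilChunk struct {
	columnIndex int
	numValues   int
}

// pages returns the chunk's pages. There is always exactly one page, and
// every entry in it is nil (null).
func (c nilChunk) pages() [][]*int64 {
	return [][]*int64{make([]*int64, c.numValues)}
}

func main() {
	c := nilChunk{columnIndex: 3, numValues: 4}
	for _, page := range c.pages() {
		fmt.Println("values in page:", len(page), "first is null:", page[0] == nil)
	}
}
```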
ParquetWriter
	( ParquetWriter) Close() error
	( ParquetWriter) Flush() error
	( ParquetWriter) Reset(writer io.Writer)
	( ParquetWriter) Schema() *parquet.Schema
	( ParquetWriter) Write(rows []any) (int, error)
	( ParquetWriter) WriteRows(rows []parquet.Row) (int, error)
	PooledWriter
	ParquetWriter : github.com/polarsignals/frostdb.ParquetWriter
	ParquetWriter : github.com/apache/thrift/lib/go/thrift.Flusher
	ParquetWriter : github.com/parquet-go/parquet-go.RowWriter
	ParquetWriter : github.com/parquet-go/parquet-go.RowWriterWithSchema
	ParquetWriter : github.com/prometheus/common/expfmt.Closer
	ParquetWriter : io.Closer
	func (*Schema).NewWriter(w io.Writer, dynamicColumns map[string][]string, sorting bool, options ...parquet.WriterOption) (ParquetWriter, error)
	func github.com/polarsignals/frostdb/parts.Part.SerializeBuffer(schema *Schema, w ParquetWriter) error
	func github.com/polarsignals/frostdb/pqarrow.RecordsToFile(schema *Schema, w ParquetWriter, recs []arrow.Record) error
	func github.com/polarsignals/frostdb/pqarrow.RecordToFile(schema *Schema, w ParquetWriter, r arrow.Record) error
PooledBuffer
	Buffer *Buffer
	( PooledBuffer) Clone() (*Buffer, error)
	ColumnChunks returns the list of parquet.ColumnChunk for the given index. It contains all the pages associated with this row group's column. Implements the parquet.RowGroup interface.
	DynamicColumns returns the concrete dynamic column names of the buffer. It implements the DynamicRowGroup interface.
	DynamicRows returns an iterator for the rows in the buffer. It implements the DynamicRowGroup interface.
	NumRows returns the number of rows in the buffer. Implements the parquet.RowGroup interface.
	( PooledBuffer) Reset()
	Rows returns an iterator for the rows in the buffer. It implements the parquet.RowGroup interface.
	Schema returns the concrete parquet.Schema of the buffer. Implements the parquet.RowGroup interface.
	( PooledBuffer) Size() int64
	( PooledBuffer) Sort()
	SortingColumns returns the concrete slice of parquet.SortingColumns of the buffer. Implements the parquet.RowGroup interface.
	( PooledBuffer) String() string
	WriteRowGroup writes a row group to the buffer.
	WriteRows writes the given rows to the buffer.
	PooledBuffer : DynamicRowGroup
	PooledBuffer : github.com/polarsignals/frostdb/query/expr.Particulate
	PooledBuffer : github.com/parquet-go/parquet-go.RowGroup
	PooledBuffer : github.com/parquet-go/parquet-go.RowGroupWriter
	PooledBuffer : github.com/parquet-go/parquet-go.RowWriter
	PooledBuffer : github.com/parquet-go/parquet-go.RowWriterWithSchema
	PooledBuffer : expvar.Var
	PooledBuffer : fmt.Stringer
	func (*Schema).GetBuffer(dynamicColumns map[string][]string) (*PooledBuffer, error)
	func (*Schema).PutBuffer(b *PooledBuffer)
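The GetBuffer and PutBuffer pair suggests a borrow-and-return pattern for buffers. How the package pools them internally is not stated in this listing, so the following is only a generic sketch of that pattern, assuming sync.Pool from the standard library. The `bufferPool` type and its methods are hypothetical, and bytes.Buffer stands in for the row buffer.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufferPool is a hypothetical pool of reusable buffers built on sync.Pool.
type bufferPool struct{ pool sync.Pool }

func newBufferPool() *bufferPool {
	return &bufferPool{pool: sync.Pool{
		New: func() any { return new(bytes.Buffer) },
	}}
}

// get borrows a buffer; it is either recycled or freshly allocated.
func (p *bufferPool) get() *bytes.Buffer { return p.pool.Get().(*bytes.Buffer) }

// put resets the buffer before returning it, so the next borrower never
// sees stale contents.
func (p *bufferPool) put(b *bytes.Buffer) {
	b.Reset()
	p.pool.Put(b)
}

func main() {
	p := newBufferPool()
	b := p.get()
	b.WriteString("row data")
	fmt.Println("in use:", b.Len())
	p.put(b)
	fmt.Println("after reuse:", p.get().Len())
}
```

Resetting on return rather than on borrow keeps the invariant in one place: anything taken from the pool is empty.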
PooledParquetSchema
	Schema *parquet.Schema
	func (*Schema).GetDynamicParquetSchema(dynamicColumns map[string][]string) (*PooledParquetSchema, error)
	func (*Schema).GetParquetSortingSchema(dynamicColumns map[string][]string) (*PooledParquetSchema, error)
	func (*Schema).PutPooledParquetSchema(ps *PooledParquetSchema)
PooledWriter
	ParquetWriter ParquetWriter
	( PooledWriter) Close() error
	( PooledWriter) Flush() error
	( PooledWriter) Reset(writer io.Writer)
	( PooledWriter) Schema() *parquet.Schema
	( PooledWriter) Write(rows []any) (int, error)
	( PooledWriter) WriteRows(rows []parquet.Row) (int, error)
	PooledWriter : ParquetWriter
	PooledWriter : github.com/polarsignals/frostdb.ParquetWriter
	PooledWriter : github.com/apache/thrift/lib/go/thrift.Flusher
	PooledWriter : github.com/parquet-go/parquet-go.RowWriter
	PooledWriter : github.com/parquet-go/parquet-go.RowWriterWithSchema
	PooledWriter : github.com/prometheus/common/expfmt.Closer
	PooledWriter : io.Closer
	func (*Schema).GetWriter(w io.Writer, dynamicColumns map[string][]string, sorting bool) (*PooledWriter, error)
	func (*Schema).PutWriter(w *PooledWriter)
type Sample = samples.Sample (struct)
type Samples = samples.Samples ([])
Schema is a dynamic parquet schema. It extends a parquet schema so that any column definition marked as dynamic has its concrete columns created the first time their column name is seen.
	UniquePrimaryIndex bool
	(*Schema) Cmp(a, b *DynamicRow) int
	(*Schema) ColumnByName(name string) (ColumnDefinition, bool)
	(*Schema) ColumnDefinitionsForSortingColumns() []ColumnDefinition
	(*Schema) Columns() []ColumnDefinition
	(*Schema) Definition() proto.Message
	FindColumn returns a column definition for the column passed.
	FindDynamicColumn returns a dynamic column definition for the column passed.
	FindDynamicColumnForConcreteColumn returns a column definition for the column passed. For example, "labels.label1" would return the column definition for the dynamic column "labels" if it exists.
	(*Schema) GetBuffer(dynamicColumns map[string][]string) (*PooledBuffer, error)
	GetDynamicParquetSchema returns a parquet schema of all the columns and the given dynamic columns. The difference with GetParquetSortingSchema is that all columns are included in the parquet schema.
	GetParquetSortingSchema returns a parquet schema of the sorting columns and the given dynamic columns. The difference with GetDynamicParquetSchema is that non-sorting columns are elided.
	(*Schema) GetWriter(w io.Writer, dynamicColumns map[string][]string, sorting bool) (*PooledWriter, error)
	MergeDynamicRowGroups merges the given dynamic row groups into a single dynamic row group. It merges the parquet schema in a non-conflicting way by merging all the concrete dynamic column names and generating a superset parquet schema that all given dynamic row groups are compatible with.
	(*Schema) Name() string
	NewBuffer returns a new buffer with a concrete parquet schema generated using the given concrete dynamic column names.
	(*Schema) NewBufferV2(dynamicColumns ...*schemav2pb.Node) (*Buffer, error)
	NewWriter returns a new parquet writer with a concrete parquet schema generated using the given concrete dynamic column names.
	(*Schema) ParquetSchema() *parquet.Schema
	ParquetSortingColumns returns the parquet sorting columns for the dynamic sorting columns with the concrete dynamic column names given in the argument.
	(*Schema) PutBuffer(b *PooledBuffer)
	(*Schema) PutPooledParquetSchema(ps *PooledParquetSchema)
	(*Schema) PutWriter(w *PooledWriter)
	(*Schema) ResetBuffers()
	(*Schema) ResetWriters()
	(*Schema) RowLessThan(a, b *DynamicRow) bool
	(*Schema) SerializeBuffer(w io.Writer, buffer *Buffer) error
	(*Schema) SortingColumns() []SortingColumn
	*Schema : github.com/polarsignals/frostdb/query/logicalplan.Named
	func NewSampleSchema() *Schema
	func SchemaFromDefinition(msg proto.Message) (*Schema, error)
	func SchemaFromParquetFile(file *parquet.File) (*Schema, error)
	func github.com/polarsignals/frostdb.(*Table).Schema() *Schema
	func github.com/polarsignals/frostdb/query/logicalplan.(*LogicalPlan).InputSchema() *Schema
	func github.com/polarsignals/frostdb/query/logicalplan.TableReader.Schema() *Schema
	func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
	func PrehashColumns(schema *Schema, r arrow.Record) arrow.Record
	func ToBuffer(s Samples, schema *Schema) (*Buffer, error)
	func (*DynamicRows).IsSorted(schema *Schema) bool
	func github.com/polarsignals/frostdb.DataSinkSource.Scan(ctx context.Context, prefix string, schema *Schema, filter logicalplan.Expr, lastBlockTimestamp uint64, callback func(context.Context, any) error) error
	func github.com/polarsignals/frostdb.DataSource.Scan(ctx context.Context, prefix string, schema *Schema, filter logicalplan.Expr, lastBlockTimestamp uint64, callback func(context.Context, any) error) error
	func github.com/polarsignals/frostdb.(*DefaultObjstoreBucket).Scan(ctx context.Context, prefix string, _ *Schema, filter logicalplan.Expr, lastBlockTimestamp uint64, callback func(context.Context, any) error) error
	func github.com/polarsignals/frostdb/index.NewLSM(dir string, schema *Schema, levels []*index.LevelConfig, watermark func() uint64, options ...index.LSMOption) (*index.LSM, error)
	func github.com/polarsignals/frostdb/index.(*LSM).Scan(ctx context.Context, _ string, _ *Schema, filter logicalplan.Expr, tx uint64, callback func(context.Context, any) error) error
	func github.com/polarsignals/frostdb/parts.FindMaximumNonOverlappingSet(schema *Schema, parts []parts.Part) ([]parts.Part, []parts.Part, error)
	func github.com/polarsignals/frostdb/parts.NewArrowPart(tx uint64, record arrow.Record, size uint64, schema *Schema, options ...parts.Option) parts.Part
	func github.com/polarsignals/frostdb/parts.NewPartSorter(schema *Schema, parts []parts.Part) *parts.PartSorter
	func github.com/polarsignals/frostdb/parts.Part.AsSerializedBuffer(schema *Schema) (*SerializedBuffer, error)
	func github.com/polarsignals/frostdb/parts.Part.OverlapsWith(schema *Schema, otherPart parts.Part) (bool, error)
	func github.com/polarsignals/frostdb/parts.Part.SerializeBuffer(schema *Schema, w ParquetWriter) error
	func github.com/polarsignals/frostdb/pqarrow.ParquetRowGroupToArrowSchema(ctx context.Context, rg parquet.RowGroup, s *Schema, options logicalplan.IterOptions) (*arrow.Schema, error)
	func github.com/polarsignals/frostdb/pqarrow.ParquetSchemaToArrowSchema(ctx context.Context, schema *parquet.Schema, s *Schema, options logicalplan.IterOptions) (*arrow.Schema, error)
	func github.com/polarsignals/frostdb/pqarrow.RecordsToFile(schema *Schema, w ParquetWriter, recs []arrow.Record) error
	func github.com/polarsignals/frostdb/pqarrow.RecordToFile(schema *Schema, w ParquetWriter, r arrow.Record) error
	func github.com/polarsignals/frostdb/pqarrow.SerializeRecord(r arrow.Record, schema *Schema) (*SerializedBuffer, error)
	func github.com/polarsignals/frostdb/pqarrow.(*ParquetConverter).Convert(ctx context.Context, rg parquet.RowGroup, s *Schema) error
	func github.com/polarsignals/frostdb/query/logicalplan.DataTypeForExprWithSchema(expr logicalplan.Expr, s *Schema) (arrow.DataType, error)
	func github.com/polarsignals/frostdb/query/physicalplan.Build(ctx context.Context, pool memory.Allocator, tracer trace.Tracer, s *Schema, plan *logicalplan.LogicalPlan, options ...physicalplan.Option) (*physicalplan.OutputPlan, error)
	func github.com/polarsignals/frostdb/storage.(*Iceberg).Scan(ctx context.Context, prefix string, _ *Schema, filter logicalplan.Expr, _ uint64, callback func(context.Context, any) error) error
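The relationship between a concrete column such as "labels.label1" and its dynamic column "labels", as described for FindDynamicColumnForConcreteColumn, can be sketched with the standard library. The `dynamicColumnFor` helper is hypothetical, and splitting at the first "." is an assumption taken from that one documented example rather than from the package source.

```go
package main

import (
	"fmt"
	"strings"
)

// dynamicColumnFor returns the dynamic column name that a concrete column
// such as "labels.label1" belongs to. It reports false when the name has no
// "." separator or the prefix is not a known dynamic column.
func dynamicColumnFor(concrete string, dynamic map[string]bool) (string, bool) {
	prefix, _, found := strings.Cut(concrete, ".")
	if !found || !dynamic[prefix] {
		return "", false
	}
	return prefix, true
}

func main() {
	dynamic := map[string]bool{"labels": true}
	for _, name := range []string{"labels.label1", "timestamp"} {
		col, ok := dynamicColumnFor(name, dynamic)
		fmt.Printf("%q -> %q (found: %t)\n", name, col, ok)
	}
}
```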
SerializedBuffer
	(*SerializedBuffer) DynamicColumns() map[string][]string
	(*SerializedBuffer) DynamicRowGroup(i int) DynamicRowGroup
	(*SerializedBuffer) DynamicRows() DynamicRowReader
	MultiDynamicRowGroup returns all the row groups wrapped in a single multi row group.
	(*SerializedBuffer) NumRowGroups() int
	(*SerializedBuffer) NumRows() int64
	(*SerializedBuffer) ParquetFile() *parquet.File
	(*SerializedBuffer) Reader() *parquet.GenericReader[any]
	(*SerializedBuffer) String() string
	*SerializedBuffer : expvar.Var
	*SerializedBuffer : fmt.Stringer
	func NewSerializedBuffer(f *parquet.File) (*SerializedBuffer, error)
	func ReaderFromBytes(buf []byte) (*SerializedBuffer, error)
	func github.com/polarsignals/frostdb/parts.Part.AsSerializedBuffer(schema *Schema) (*SerializedBuffer, error)
	func github.com/polarsignals/frostdb/pqarrow.SerializeRecord(r arrow.Record, schema *Schema) (*SerializedBuffer, error)
	func github.com/polarsignals/frostdb/parts.NewParquetPart(tx uint64, buf *SerializedBuffer, options ...parts.Option) parts.Part
SortingColumn describes a column to sort by in a dynamic parquet schema.
	( SortingColumn) ColumnName() string
	Returns true if the column will sort values in descending order.
	Returns true if the column will put null values at the beginning.
	Returns the path of the column in the row group schema, omitting the name of the root node.
	SortingColumn : github.com/parquet-go/parquet-go.SortingColumn
	func Ascending(column string) SortingColumn
	func Descending(column string) SortingColumn
	func NullsFirst(sortingColumn SortingColumn) SortingColumn
	func (*Schema).SortingColumns() []SortingColumn
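Ascending and Descending construct a SortingColumn, and NullsFirst wraps an existing one. A wrapper of that kind can be sketched in plain Go by embedding the interface and overriding a single method. All types below are hypothetical stand-ins and omit the Path method; they show the composition pattern, not the package's own types.

```go
package main

import "fmt"

// sortingColumn is a hypothetical, reduced version of the interface.
type sortingColumn interface {
	ColumnName() string
	Descending() bool
	NullsFirst() bool
}

// ascending and descending are named string types carrying the column name.
type ascending string

func (a ascending) ColumnName() string { return string(a) }
func (a ascending) Descending() bool   { return false }
func (a ascending) NullsFirst() bool   { return false }

type descending string

func (d descending) ColumnName() string { return string(d) }
func (d descending) Descending() bool   { return true }
func (d descending) NullsFirst() bool   { return false }

// nullsFirst wraps any sortingColumn. The embedded interface supplies
// ColumnName and Descending; only NullsFirst is overridden.
type nullsFirst struct{ sortingColumn }

func (nullsFirst) NullsFirst() bool { return true }

func main() {
	col := nullsFirst{descending("timestamp")}
	fmt.Println(col.ColumnName(), col.Descending(), col.NullsFirst())
}
```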
StorageLayout
	( StorageLayout) GetCompressionInt32() int32
	( StorageLayout) GetEncodingInt32() int32
	( StorageLayout) GetNullable() bool
	( StorageLayout) GetRepeated() bool
	( StorageLayout) GetTypeInt32() int32
	func StorageLayoutWrapper(_ *schemav2pb.StorageLayout) StorageLayout
Package-Level Functions (total 33)
Ascending constructs a SortingColumn value which dictates to sort by the column in ascending order.
DefinitionFromParquetFile converts a parquet file into a schemapb.Schema.
Descending constructs a SortingColumn value which dictates to sort by the column in descending order.
func FindChildIndex(fields []parquet.Field, name string) int
FindHashedColumn finds the index of the column in the given fields that has been prehashed.
MergeDeduplicatedDynCols is a light wrapper over sorting the deduplicated dynamic column names provided in dyn. It is extracted as a public method since this merging determines the order in which dynamic columns are stored and components from other packages sometimes need to figure out the physical sort order between dynamic columns.
func NewDynamicRow(row parquet.Row, schema *parquet.Schema, dyncols map[string][]string, fields []parquet.Field) *DynamicRow
NewDynamicRowGroupMergeAdapter returns a *DynamicRowGroupMergeAdapter, which maps the columns of the original row group to the columns in the superset schema provided. This allows row groups that have non-conflicting dynamic schemas to be merged into a single row group with a superset parquet schema. The provided schema must not conflict with the original row group's schema; it must be strictly a superset. This property is not checked; it is assumed to be true for performance reasons.
func NewDynamicRows(rows []parquet.Row, schema *parquet.Schema, dynamicColumns map[string][]string, fields []parquet.Field) *DynamicRows
NewNilColumnChunk creates a new column chunk configured with the given type, column index and number of values in the page.
NullsFirst wraps the SortingColumn passed as argument so that it instructs the row group to place null values first in the column.
PrehashColumns prehashes the columns in the given record that have been marked as prehashed in the given schema.
RemoveHashedColumns removes the hashed columns from the record.
SchemaFromParquetFile converts a parquet file into a dynparquet.Schema.
func ToBuffer(s Samples, schema *Schema) (*Buffer, error)
Package-Level Constants (total 2)
The size of the column indices in parquet files.
const DynamicColumnsKey = "dynamic_columns"