package dynparquet
Import Path
github.com/polarsignals/frostdb/dynparquet (on go.dev)
Dependency Relation
imports 25 packages, and imported by 7 packages
Involved Source Files
concat.go
dynamiccolumns.go
example.go
hashed.go
nil_chunk.go
reader.go
row.go
schema.go
Package-Level Type Names (total 21)
Buffer represents a batch of rows with a concrete set of dynamic column
names representing how its parquet schema was created off of a dynamic
parquet schema.
(*Buffer) Clone() (*Buffer, error)
ColumnChunks returns the list of parquet.ColumnChunk for the given index.
It contains all the pages associated with this row group's column.
Implements the parquet.RowGroup interface.
DynamicColumns returns the concrete dynamic column names of the buffer. It
implements the DynamicRowGroup interface.
DynamicRows returns an iterator for the rows in the buffer. It implements the
DynamicRowGroup interface.
NumRows returns the number of rows in the buffer. Implements the
parquet.RowGroup interface.
(*Buffer) Reset()
Rows returns an iterator for the rows in the buffer. It implements the
parquet.RowGroup interface.
Schema returns the concrete parquet.Schema of the buffer. Implements the
parquet.RowGroup interface.
(*Buffer) Size() int64
(*Buffer) Sort()
SortingColumns returns the concrete slice of parquet.SortingColumns of the
buffer. Implements the parquet.RowGroup interface.
(*Buffer) String() string
WriteRowGroup writes a row group to the buffer.
WriteRow writes a single row to the buffer.
*Buffer : DynamicRowGroup
*Buffer : github.com/polarsignals/frostdb/query/expr.Particulate
*Buffer : github.com/parquet-go/parquet-go.RowGroup
*Buffer : github.com/parquet-go/parquet-go.RowGroupWriter
*Buffer : github.com/parquet-go/parquet-go.RowWriter
*Buffer : github.com/parquet-go/parquet-go.RowWriterWithSchema
*Buffer : expvar.Var
*Buffer : fmt.Stringer
func ToBuffer(s Samples, schema *Schema) (*Buffer, error)
func (*Buffer).Clone() (*Buffer, error)
func (*Schema).NewBuffer(dynamicColumns map[string][]string) (*Buffer, error)
func (*Schema).NewBufferV2(dynamicColumns ...*schemav2pb.Node) (*Buffer, error)
func (*Schema).SerializeBuffer(w io.Writer, buffer *Buffer) error
ColumnDefinition describes a column in a dynamic parquet schema.
Dynamic bool
Name string
PreHash bool
StorageLayout parquet.Node
func (*Schema).ColumnByName(name string) (ColumnDefinition, bool)
func (*Schema).ColumnDefinitionsForSortingColumns() []ColumnDefinition
func (*Schema).Columns() []ColumnDefinition
func (*Schema).FindColumn(column string) (ColumnDefinition, bool)
func (*Schema).FindDynamicColumn(dynamicColumnName string) (ColumnDefinition, bool)
func (*Schema).FindDynamicColumnForConcreteColumn(column string) (ColumnDefinition, bool)
DynamicColumns map[string][]string
Row parquet.Row
Schema *parquet.Schema
func NewDynamicRow(row parquet.Row, schema *parquet.Schema, dyncols map[string][]string, fields []parquet.Field) *DynamicRow
func (*DynamicRows).Get(i int) *DynamicRow
func (*DynamicRows).GetCopy(i int) *DynamicRow
func github.com/polarsignals/frostdb/parts.Part.Least() (*DynamicRow, error)
func github.com/polarsignals/frostdb/parts.Part.Most() (*DynamicRow, error)
func github.com/polarsignals/frostdb/pqarrow.RecordToDynamicRow(pqSchema *parquet.Schema, record arrow.Record, dyncols map[string][]string, index int) (*DynamicRow, error)
func (*Schema).Cmp(a, b *DynamicRow) int
func (*Schema).RowLessThan(a, b *DynamicRow) bool
DynamicRowGroup is a parquet.RowGroup that can describe the concrete dynamic
columns.
Returns the list of column chunks in this row group. The chunks are
ordered in the order of leaf columns from the row group's schema.
If the underlying implementation is not read-only, the returned
parquet.ColumnChunk may implement other interfaces: for example,
parquet.ColumnBuffer if the chunk is backed by an in-memory buffer,
or typed writer interfaces like parquet.Int32Writer depending on the
underlying type of values that can be written to the chunk.
As an optimization, the row group may return the same slice across
multiple calls to this method. Applications should treat the returned
slice as read-only.
DynamicColumns returns the concrete dynamic column names that were used to
create its concrete parquet schema from a dynamic parquet schema.
DynamicRows returns an iterator over the rows in the row group.
Returns the number of rows in the group.
Returns a reader exposing the rows of the row group.
As an optimization, the returned parquet.Rows object may implement
parquet.RowWriterTo, and test the RowWriter it receives for an
implementation of the parquet.RowGroupWriter interface.
This optimization mechanism is leveraged by the parquet.CopyRows function
to skip the generic row-by-row copy algorithm and delegate the copy logic
to the parquet.Rows object.
Returns the schema of rows in the group.
Returns the list of sorting columns describing how rows are sorted in the
group.
The method will return an empty slice if the rows are not sorted.
( DynamicRowGroup) String() string
*Buffer
*MergedRowGroup
PooledBuffer
github.com/polarsignals/frostdb/index.ReleaseableRowGroup (interface)
DynamicRowGroup : github.com/polarsignals/frostdb/query/expr.Particulate
DynamicRowGroup : github.com/parquet-go/parquet-go.RowGroup
DynamicRowGroup : expvar.Var
DynamicRowGroup : fmt.Stringer
func Concat(fields []parquet.Field, drg ...DynamicRowGroup) DynamicRowGroup
func (*Schema).MergeDynamicRowGroups(rowGroups []DynamicRowGroup, options ...MergeOption) (DynamicRowGroup, error)
func (*SerializedBuffer).DynamicRowGroup(i int) DynamicRowGroup
func (*SerializedBuffer).MultiDynamicRowGroup() DynamicRowGroup
func Concat(fields []parquet.Field, drg ...DynamicRowGroup) DynamicRowGroup
func (*Schema).MergeDynamicRowGroups(rowGroups []DynamicRowGroup, options ...MergeOption) (DynamicRowGroup, error)
DynamicRowGroupMergeAdapter maps a RowBatch with a Schema with a subset of dynamic
columns to a Schema with a superset of dynamic columns. It implements the
parquet.RowGroup interface.
Returns the leaf column at the given index in the group. Searches for the
same column in the original batch. If not found returns a column chunk
filled with nulls.
Returns the number of rows in the group.
Returns a reader exposing the rows of the row group.
Returns the schema of rows in the group. The schema is the configured
merged, superset schema.
Returns the list of sorting columns describing how rows are sorted in the
group.
The method will return an empty slice if the rows are not sorted.
*DynamicRowGroupMergeAdapter : github.com/polarsignals/frostdb/query/expr.Particulate
*DynamicRowGroupMergeAdapter : github.com/parquet-go/parquet-go.RowGroup
func NewDynamicRowGroupMergeAdapter(schema *parquet.Schema, sortingColumns []parquet.SortingColumn, mergedDynamicColumns map[string][]string, originalRowGroup parquet.RowGroup) *DynamicRowGroupMergeAdapter
DynamicRowReader is an iterator over the rows in a DynamicRowGroup.
( DynamicRowReader) Close() error
( DynamicRowReader) ReadRows(*DynamicRows) (int, error)
Positions the stream on the given row index.
Some implementations of the interface may only allow seeking forward.
The method returns io.ErrClosedPipe if the stream had already been closed.
DynamicRowReader : github.com/parquet-go/parquet-go.RowSeeker
DynamicRowReader : github.com/prometheus/common/expfmt.Closer
DynamicRowReader : io.Closer
func (*Buffer).DynamicRows() DynamicRowReader
func DynamicRowGroup.DynamicRows() DynamicRowReader
func (*MergedRowGroup).DynamicRows() DynamicRowReader
func (*SerializedBuffer).DynamicRows() DynamicRowReader
func github.com/polarsignals/frostdb/index.ReleaseableRowGroup.DynamicRows() DynamicRowReader
DynamicColumns map[string][]string
Rows []parquet.Row
Schema *parquet.Schema
(*DynamicRows) Get(i int) *DynamicRow
(*DynamicRows) GetCopy(i int) *DynamicRow
(*DynamicRows) IsSorted(schema *Schema) bool
func NewDynamicRows(rows []parquet.Row, schema *parquet.Schema, dynamicColumns map[string][]string, fields []parquet.Field) *DynamicRows
func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
func DynamicRowReader.ReadRows(*DynamicRows) (int, error)
(*DynamicRowSorter) Len() int
(*DynamicRowSorter) Less(i, j int) bool
(*DynamicRowSorter) Swap(i, j int)
*DynamicRowSorter : sort.Interface
func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
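The Len, Less and Swap methods listed above are what sort.Interface requires. The following is a minimal, standard-library-only sketch of that pattern; the row type, rowSorter, sortRows and byLabelThenTimestamp are hypothetical stand-ins for illustration and are not the dynparquet implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for a DynamicRow.
type row struct {
	label     string
	timestamp int64
}

// rowSorter adapts a slice of rows plus a comparison function to
// sort.Interface by providing Len, Less and Swap.
type rowSorter struct {
	rows []row
	less func(a, b row) bool
}

func (s *rowSorter) Len() int           { return len(s.rows) }
func (s *rowSorter) Less(i, j int) bool { return s.less(s.rows[i], s.rows[j]) }
func (s *rowSorter) Swap(i, j int)      { s.rows[i], s.rows[j] = s.rows[j], s.rows[i] }

// byLabelThenTimestamp plays the role of a schema-defined row comparison
// (assumed here: label first, then timestamp, both ascending).
func byLabelThenTimestamp(a, b row) bool {
	if a.label != b.label {
		return a.label < b.label
	}
	return a.timestamp < b.timestamp
}

// sortRows sorts rows in place using the given comparison.
func sortRows(rows []row, less func(a, b row) bool) {
	sort.Sort(&rowSorter{rows: rows, less: less})
}

func main() {
	rows := []row{{"b", 2}, {"a", 3}, {"a", 1}}
	sortRows(rows, byLabelThenTimestamp)
	fmt.Println(rows) // [{a 1} {a 3} {b 2}]
}
```

Keeping the comparison separate from the sorter means the same ordering can also back an "is sorted" check, in the way IsSorted takes a schema.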
MergedRowGroup allows wrapping any parquet.RowGroup to implement the
DynamicRowGroup interface by specifying the concrete dynamic column names
the RowGroup's schema contains.
DynCols map[string][]string
RowGroup parquet.RowGroup
Returns the list of column chunks in this row group. The chunks are
ordered in the order of leaf columns from the row group's schema.
If the underlying implementation is not read-only, the returned
parquet.ColumnChunk may implement other interfaces: for example,
parquet.ColumnBuffer if the chunk is backed by an in-memory buffer,
or typed writer interfaces like parquet.Int32Writer depending on the
underlying type of values that can be written to the chunk.
As an optimization, the row group may return the same slice across
multiple calls to this method. Applications should treat the returned
slice as read-only.
DynamicColumns returns the concrete dynamic column names that were used to
create its concrete parquet schema from a dynamic parquet schema. Implements
the DynamicRowGroup interface.
DynamicRows returns an iterator over the rows in the row group. Implements
the DynamicRowGroup interface.
Returns the number of rows in the group.
Returns a reader exposing the rows of the row group.
As an optimization, the returned parquet.Rows object may implement
parquet.RowWriterTo, and test the RowWriter it receives for an
implementation of the parquet.RowGroupWriter interface.
This optimization mechanism is leveraged by the parquet.CopyRows function
to skip the generic row-by-row copy algorithm and delegate the copy logic
to the parquet.Rows object.
Returns the schema of rows in the group.
Returns the list of sorting columns describing how rows are sorted in the
group.
The method will return an empty slice if the rows are not sorted.
(*MergedRowGroup) String() string
*MergedRowGroup : DynamicRowGroup
MergedRowGroup : github.com/polarsignals/frostdb/query/expr.Particulate
MergedRowGroup : github.com/parquet-go/parquet-go.RowGroup
*MergedRowGroup : expvar.Var
*MergedRowGroup : fmt.Stringer
func WithAlreadySorted() MergeOption
func WithDynamicCols(cols map[string][]string) MergeOption
func (*Schema).MergeDynamicRowGroups(rowGroups []DynamicRowGroup, options ...MergeOption) (DynamicRowGroup, error)
NilColumnChunk is a column chunk that contains a single page with all null
values of the given type, given length and column index of the parent
schema. It implements the parquet.ColumnChunk interface.
BloomFilter returns the bloom filter of the column chunk. Since the
NilColumnChunk is a virtual column chunk only for in-memory purposes, it
returns nil. Implements the parquet.ColumnChunk interface.
Column returns the index of the column chunk within the parent schema.
Implements the parquet.ColumnChunk interface.
ColumnIndex returns the column index of the column chunk. Since the
NilColumnChunk is a virtual column chunk only for in-memory purposes, it
returns nil. Implements the parquet.ColumnChunk interface.
NumValues returns the number of values in the column chunk. Implements the
parquet.ColumnChunk interface.
OffsetIndex returns the offset index of the column chunk. Since the
NilColumnChunk is a virtual column chunk only for in-memory purposes, it
returns nil. Implements the parquet.ColumnChunk interface.
Pages returns an iterator for all pages within the column chunk. This
iterator will only ever return a single page filled with all null values of
the configured amount. Implements the parquet.ColumnChunk interface.
Type returns the type of the column chunk. Implements the
parquet.ColumnChunk interface.
*NilColumnChunk : github.com/parquet-go/parquet-go.ColumnChunk
func NewNilColumnChunk(typ parquet.Type, columnIndex, numValues int) *NilColumnChunk
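The Pages description above says the iterator only ever yields one page filled with null values. A minimal sketch of such a one-shot iterator follows, using only the standard library; nullPage and singleNullPages are hypothetical types for illustration, and returning io.EOF after the last page is an assumption modelled on common Go reader conventions.

```go
package main

import (
	"fmt"
	"io"
)

// nullPage is a hypothetical page in which every value is null.
type nullPage struct {
	numValues int
	numNulls  int
}

// singleNullPages yields exactly one nullPage and then io.EOF.
type singleNullPages struct {
	numValues int
	done      bool
}

// ReadPage returns the single all-null page on the first call and
// io.EOF on every later call.
func (p *singleNullPages) ReadPage() (nullPage, error) {
	if p.done {
		return nullPage{}, io.EOF
	}
	p.done = true
	return nullPage{numValues: p.numValues, numNulls: p.numValues}, nil
}

func main() {
	pages := &singleNullPages{numValues: 3}
	for {
		page, err := pages.ReadPage()
		if err == io.EOF {
			break
		}
		fmt.Println(page.numValues, page.numNulls) // 3 3
	}
}
```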
( ParquetWriter) Close() error
( ParquetWriter) Flush() error
( ParquetWriter) Reset(writer io.Writer)
( ParquetWriter) Schema() *parquet.Schema
( ParquetWriter) Write(rows []any) (int, error)
( ParquetWriter) WriteRows(rows []parquet.Row) (int, error)
PooledWriter
ParquetWriter : github.com/polarsignals/frostdb.ParquetWriter
ParquetWriter : github.com/apache/thrift/lib/go/thrift.Flusher
ParquetWriter : github.com/parquet-go/parquet-go.RowWriter
ParquetWriter : github.com/parquet-go/parquet-go.RowWriterWithSchema
ParquetWriter : github.com/prometheus/common/expfmt.Closer
ParquetWriter : io.Closer
func (*Schema).NewWriter(w io.Writer, dynamicColumns map[string][]string, sorting bool, options ...parquet.WriterOption) (ParquetWriter, error)
func github.com/polarsignals/frostdb/parts.Part.SerializeBuffer(schema *Schema, w ParquetWriter) error
func github.com/polarsignals/frostdb/pqarrow.RecordsToFile(schema *Schema, w ParquetWriter, recs []arrow.Record) error
func github.com/polarsignals/frostdb/pqarrow.RecordToFile(schema *Schema, w ParquetWriter, r arrow.Record) error
Buffer *Buffer
( PooledBuffer) Clone() (*Buffer, error)
ColumnChunks returns the list of parquet.ColumnChunk for the given index.
It contains all the pages associated with this row group's column.
Implements the parquet.RowGroup interface.
DynamicColumns returns the concrete dynamic column names of the buffer. It
implements the DynamicRowGroup interface.
DynamicRows returns an iterator for the rows in the buffer. It implements the
DynamicRowGroup interface.
NumRows returns the number of rows in the buffer. Implements the
parquet.RowGroup interface.
( PooledBuffer) Reset()
Rows returns an iterator for the rows in the buffer. It implements the
parquet.RowGroup interface.
Schema returns the concrete parquet.Schema of the buffer. Implements the
parquet.RowGroup interface.
( PooledBuffer) Size() int64
( PooledBuffer) Sort()
SortingColumns returns the concrete slice of parquet.SortingColumns of the
buffer. Implements the parquet.RowGroup interface.
( PooledBuffer) String() string
WriteRowGroup writes a row group to the buffer.
WriteRow writes a single row to the buffer.
PooledBuffer : DynamicRowGroup
PooledBuffer : github.com/polarsignals/frostdb/query/expr.Particulate
PooledBuffer : github.com/parquet-go/parquet-go.RowGroup
PooledBuffer : github.com/parquet-go/parquet-go.RowGroupWriter
PooledBuffer : github.com/parquet-go/parquet-go.RowWriter
PooledBuffer : github.com/parquet-go/parquet-go.RowWriterWithSchema
PooledBuffer : expvar.Var
PooledBuffer : fmt.Stringer
func (*Schema).GetBuffer(dynamicColumns map[string][]string) (*PooledBuffer, error)
func (*Schema).PutBuffer(b *PooledBuffer)
Schema *parquet.Schema
func (*Schema).GetDynamicParquetSchema(dynamicColumns map[string][]string) (*PooledParquetSchema, error)
func (*Schema).GetParquetSortingSchema(dynamicColumns map[string][]string) (*PooledParquetSchema, error)
func (*Schema).PutPooledParquetSchema(ps *PooledParquetSchema)
ParquetWriter ParquetWriter
( PooledWriter) Close() error
( PooledWriter) Flush() error
( PooledWriter) Reset(writer io.Writer)
( PooledWriter) Schema() *parquet.Schema
( PooledWriter) Write(rows []any) (int, error)
( PooledWriter) WriteRows(rows []parquet.Row) (int, error)
PooledWriter : ParquetWriter
PooledWriter : github.com/polarsignals/frostdb.ParquetWriter
PooledWriter : github.com/apache/thrift/lib/go/thrift.Flusher
PooledWriter : github.com/parquet-go/parquet-go.RowWriter
PooledWriter : github.com/parquet-go/parquet-go.RowWriterWithSchema
PooledWriter : github.com/prometheus/common/expfmt.Closer
PooledWriter : io.Closer
func (*Schema).GetWriter(w io.Writer, dynamicColumns map[string][]string, sorting bool) (*PooledWriter, error)
func (*Schema).PutWriter(w *PooledWriter)
Schema is a dynamic parquet schema. It extends a parquet schema so that any
column definition marked as dynamic has its concrete columns created
dynamically as each column name is seen for the first time.
UniquePrimaryIndex bool
(*Schema) Cmp(a, b *DynamicRow) int
(*Schema) ColumnByName(name string) (ColumnDefinition, bool)
(*Schema) ColumnDefinitionsForSortingColumns() []ColumnDefinition
(*Schema) Columns() []ColumnDefinition
(*Schema) Definition() proto.Message
FindColumn returns a column definition for the column passed.
FindDynamicColumn returns a dynamic column definition for the column passed.
FindDynamicColumnForConcreteColumn returns a column definition for the
column passed. So "labels.label1" would return the column definition for the
dynamic column "labels" if it exists.
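The "labels.label1" example above can be sketched as a prefix lookup. This is a standard-library-only illustration, assuming the dynamic column name is the part before the first "."; columnDef and findDynamicForConcrete are hypothetical names, not the dynparquet API.

```go
package main

import (
	"fmt"
	"strings"
)

// columnDef is a hypothetical, reduced stand-in for ColumnDefinition.
type columnDef struct {
	Name    string
	Dynamic bool
}

// findDynamicForConcrete takes a concrete column name such as
// "labels.label1", cuts it at the first ".", and returns the dynamic
// column definition whose name matches the prefix, if one exists.
func findDynamicForConcrete(defs []columnDef, concrete string) (columnDef, bool) {
	prefix, _, found := strings.Cut(concrete, ".")
	if !found {
		return columnDef{}, false
	}
	for _, def := range defs {
		if def.Dynamic && def.Name == prefix {
			return def, true
		}
	}
	return columnDef{}, false
}

func main() {
	defs := []columnDef{{Name: "labels", Dynamic: true}, {Name: "timestamp"}}
	def, ok := findDynamicForConcrete(defs, "labels.label1")
	fmt.Println(def.Name, ok) // labels true
}
```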
(*Schema) GetBuffer(dynamicColumns map[string][]string) (*PooledBuffer, error)
GetDynamicParquetSchema returns a parquet schema of all columns and
the given dynamic columns.
The difference with GetParquetSortingSchema is that all columns are included
in the parquet schema.
GetParquetSortingSchema returns a parquet schema of the sorting columns and
the given dynamic columns.
The difference with GetDynamicParquetSchema is that non-sorting columns are elided.
(*Schema) GetWriter(w io.Writer, dynamicColumns map[string][]string, sorting bool) (*PooledWriter, error)
MergeDynamicRowGroups merges the given dynamic row groups into a single
dynamic row group. It merges the parquet schema in a non-conflicting way by
merging all the concrete dynamic column names and generating a superset
parquet schema that all given dynamic row groups are compatible with.
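The merging of concrete dynamic column names described above can be illustrated with plain maps. The sketch below is an assumption about the general idea (union the names per dynamic column, deduplicate, sort), not the package's code; mergeDynamicColumns is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeDynamicColumns unions the concrete names of each dynamic column
// across all sets, removing duplicates and sorting the result so that
// every input set is a subset of the output.
func mergeDynamicColumns(sets ...map[string][]string) map[string][]string {
	seen := map[string]map[string]struct{}{}
	for _, set := range sets {
		for col, names := range set {
			if seen[col] == nil {
				seen[col] = map[string]struct{}{}
			}
			for _, name := range names {
				seen[col][name] = struct{}{}
			}
		}
	}
	merged := make(map[string][]string, len(seen))
	for col, names := range seen {
		out := make([]string, 0, len(names))
		for name := range names {
			out = append(out, name)
		}
		sort.Strings(out)
		merged[col] = out
	}
	return merged
}

func main() {
	a := map[string][]string{"labels": {"pod", "container"}}
	b := map[string][]string{"labels": {"namespace", "pod"}}
	fmt.Println(mergeDynamicColumns(a, b)["labels"]) // [container namespace pod]
}
```

Because the output is a superset of every input, each original row group stays compatible with the merged schema; columns a group lacks can be treated as null.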
(*Schema) Name() string
NewBuffer returns a new buffer with a concrete parquet schema generated
using the given concrete dynamic column names.
(*Schema) NewBufferV2(dynamicColumns ...*schemav2pb.Node) (*Buffer, error)
NewWriter returns a new parquet writer with a concrete parquet schema
generated using the given concrete dynamic column names.
(*Schema) ParquetSchema() *parquet.Schema
ParquetSortingColumns returns the parquet sorting columns for the dynamic
sorting columns with the concrete dynamic column names given in the
argument.
(*Schema) PutBuffer(b *PooledBuffer)
(*Schema) PutPooledParquetSchema(ps *PooledParquetSchema)
(*Schema) PutWriter(w *PooledWriter)
(*Schema) ResetBuffers()
(*Schema) ResetWriters()
(*Schema) RowLessThan(a, b *DynamicRow) bool
(*Schema) SerializeBuffer(w io.Writer, buffer *Buffer) error
(*Schema) SortingColumns() []SortingColumn
*Schema : github.com/polarsignals/frostdb/query/logicalplan.Named
func NewSampleSchema() *Schema
func SchemaFromDefinition(msg proto.Message) (*Schema, error)
func SchemaFromParquetFile(file *parquet.File) (*Schema, error)
func github.com/polarsignals/frostdb.(*Table).Schema() *Schema
func github.com/polarsignals/frostdb/query/logicalplan.(*LogicalPlan).InputSchema() *Schema
func github.com/polarsignals/frostdb/query/logicalplan.TableReader.Schema() *Schema
func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
func PrehashColumns(schema *Schema, r arrow.Record) arrow.Record
func ToBuffer(s Samples, schema *Schema) (*Buffer, error)
func (*DynamicRows).IsSorted(schema *Schema) bool
func github.com/polarsignals/frostdb.DataSinkSource.Scan(ctx context.Context, prefix string, schema *Schema, filter logicalplan.Expr, lastBlockTimestamp uint64, callback func(context.Context, any) error) error
func github.com/polarsignals/frostdb.DataSource.Scan(ctx context.Context, prefix string, schema *Schema, filter logicalplan.Expr, lastBlockTimestamp uint64, callback func(context.Context, any) error) error
func github.com/polarsignals/frostdb.(*DefaultObjstoreBucket).Scan(ctx context.Context, prefix string, _ *Schema, filter logicalplan.Expr, lastBlockTimestamp uint64, callback func(context.Context, any) error) error
func github.com/polarsignals/frostdb/index.NewLSM(dir string, schema *Schema, levels []*index.LevelConfig, watermark func() uint64, options ...index.LSMOption) (*index.LSM, error)
func github.com/polarsignals/frostdb/index.(*LSM).Scan(ctx context.Context, _ string, _ *Schema, filter logicalplan.Expr, tx uint64, callback func(context.Context, any) error) error
func github.com/polarsignals/frostdb/parts.FindMaximumNonOverlappingSet(schema *Schema, parts []parts.Part) ([]parts.Part, []parts.Part, error)
func github.com/polarsignals/frostdb/parts.NewArrowPart(tx uint64, record arrow.Record, size uint64, schema *Schema, options ...parts.Option) parts.Part
func github.com/polarsignals/frostdb/parts.NewPartSorter(schema *Schema, parts []parts.Part) *parts.PartSorter
func github.com/polarsignals/frostdb/parts.Part.AsSerializedBuffer(schema *Schema) (*SerializedBuffer, error)
func github.com/polarsignals/frostdb/parts.Part.OverlapsWith(schema *Schema, otherPart parts.Part) (bool, error)
func github.com/polarsignals/frostdb/parts.Part.SerializeBuffer(schema *Schema, w ParquetWriter) error
func github.com/polarsignals/frostdb/pqarrow.ParquetRowGroupToArrowSchema(ctx context.Context, rg parquet.RowGroup, s *Schema, options logicalplan.IterOptions) (*arrow.Schema, error)
func github.com/polarsignals/frostdb/pqarrow.ParquetSchemaToArrowSchema(ctx context.Context, schema *parquet.Schema, s *Schema, options logicalplan.IterOptions) (*arrow.Schema, error)
func github.com/polarsignals/frostdb/pqarrow.RecordsToFile(schema *Schema, w ParquetWriter, recs []arrow.Record) error
func github.com/polarsignals/frostdb/pqarrow.RecordToFile(schema *Schema, w ParquetWriter, r arrow.Record) error
func github.com/polarsignals/frostdb/pqarrow.SerializeRecord(r arrow.Record, schema *Schema) (*SerializedBuffer, error)
func github.com/polarsignals/frostdb/pqarrow.(*ParquetConverter).Convert(ctx context.Context, rg parquet.RowGroup, s *Schema) error
func github.com/polarsignals/frostdb/query/logicalplan.DataTypeForExprWithSchema(expr logicalplan.Expr, s *Schema) (arrow.DataType, error)
func github.com/polarsignals/frostdb/query/physicalplan.Build(ctx context.Context, pool memory.Allocator, tracer trace.Tracer, s *Schema, plan *logicalplan.LogicalPlan, options ...physicalplan.Option) (*physicalplan.OutputPlan, error)
func github.com/polarsignals/frostdb/storage.(*Iceberg).Scan(ctx context.Context, prefix string, _ *Schema, filter logicalplan.Expr, _ uint64, callback func(context.Context, any) error) error
(*SerializedBuffer) DynamicColumns() map[string][]string
(*SerializedBuffer) DynamicRowGroup(i int) DynamicRowGroup
(*SerializedBuffer) DynamicRows() DynamicRowReader
MultiDynamicRowGroup returns all the row groups wrapped in a single multi
row group.
(*SerializedBuffer) NumRowGroups() int
(*SerializedBuffer) NumRows() int64
(*SerializedBuffer) ParquetFile() *parquet.File
(*SerializedBuffer) Reader() *parquet.GenericReader[any]
(*SerializedBuffer) String() string
*SerializedBuffer : expvar.Var
*SerializedBuffer : fmt.Stringer
func NewSerializedBuffer(f *parquet.File) (*SerializedBuffer, error)
func ReaderFromBytes(buf []byte) (*SerializedBuffer, error)
func github.com/polarsignals/frostdb/parts.Part.AsSerializedBuffer(schema *Schema) (*SerializedBuffer, error)
func github.com/polarsignals/frostdb/pqarrow.SerializeRecord(r arrow.Record, schema *Schema) (*SerializedBuffer, error)
func github.com/polarsignals/frostdb/parts.NewParquetPart(tx uint64, buf *SerializedBuffer, options ...parts.Option) parts.Part
SortingColumn describes a column to sort by in a dynamic parquet schema.
( SortingColumn) ColumnName() string
Returns true if the column will sort values in descending order.
Returns true if the column will put null values at the beginning.
Returns the path of the column in the row group schema, omitting the name
of the root node.
SortingColumn : github.com/parquet-go/parquet-go.SortingColumn
func Ascending(column string) SortingColumn
func Descending(column string) SortingColumn
func NullsFirst(sortingColumn SortingColumn) SortingColumn
func (*Schema).SortingColumns() []SortingColumn
func NullsFirst(sortingColumn SortingColumn) SortingColumn
( StorageLayout) GetCompressionInt32() int32
( StorageLayout) GetEncodingInt32() int32
( StorageLayout) GetNullable() bool
( StorageLayout) GetRepeated() bool
( StorageLayout) GetTypeInt32() int32
func StorageLayoutWrapper(_ *schemav2pb.StorageLayout) StorageLayout
Package-Level Functions (total 33)
Ascending constructs a SortingColumn value that sorts by the given column in ascending order.
func Concat(fields []parquet.Field, drg ...DynamicRowGroup) DynamicRowGroup
DefinitionFromParquetFile converts a parquet file into a schemapb.Schema.
Descending constructs a SortingColumn value that sorts by the given column in descending order.
func FindChildIndex(fields []parquet.Field, name string) int
FindHashedColumn finds the index of the column in the given fields that has been prehashed.
func HashedColumnName(col string) string
func IsHashedColumn(col string) bool
MergeDeduplicatedDynCols is a light wrapper over sorting the deduplicated
dynamic column names provided in dyn. It is extracted as a public method
since this merging determines the order in which dynamic columns are stored
and components from other packages sometimes need to figure out the physical
sort order between dynamic columns.
func MergeDynamicColumnSets(sets []map[string][]string) map[string][]string
func NewDynamicRow(row parquet.Row, schema *parquet.Schema, dyncols map[string][]string, fields []parquet.Field) *DynamicRow
NewDynamicRowGroupMergeAdapter returns a *DynamicRowGroupMergeAdapter, which
maps the columns of the original row group to the columns in the super-set
schema provided. This allows row groups that have non-conflicting dynamic
schemas to be merged into a single row group with a superset parquet schema.
The provided schema must not conflict with the original row group's schema;
it must be strictly a superset. This property is not checked; it is assumed
to be true for performance reasons.
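The mapping of a row group's columns onto a superset schema can be sketched as follows. This is a simplified illustration using plain slices rather than parquet column chunks; adaptColumns is a hypothetical helper, and a slice of nil values stands in for a null-filled column chunk.

```go
package main

import "fmt"

// adaptColumns returns one column per name in superset, in superset
// order. Columns present in original are reused as-is; missing columns
// are replaced by numRows nil values, standing in for a chunk of nulls.
// As in the description above, superset is assumed (not checked) to
// contain every column of original.
func adaptColumns(superset []string, original map[string][]any, numRows int) [][]any {
	out := make([][]any, len(superset))
	for i, name := range superset {
		if col, ok := original[name]; ok {
			out[i] = col
			continue
		}
		out[i] = make([]any, numRows) // all nil
	}
	return out
}

func main() {
	original := map[string][]any{
		"labels.a": {"x", "y"},
		"value":    {1, 2},
	}
	cols := adaptColumns([]string{"labels.a", "labels.b", "value"}, original, 2)
	for i, col := range cols {
		fmt.Println(i, len(col))
	}
}
```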
func NewDynamicRows(rows []parquet.Row, schema *parquet.Schema, dynamicColumns map[string][]string, fields []parquet.Field) *DynamicRows
func NewDynamicRowSorter(schema *Schema, rows *DynamicRows) *DynamicRowSorter
NewNilColumnChunk creates a new column chunk configured with the given type,
column index and number of values in the page.
func NewSampleSchema() *Schema
func NewSerializedBuffer(f *parquet.File) (*SerializedBuffer, error)
NullsFirst wraps the SortingColumn passed as argument so that it instructs
the row group to place null values first in the column.
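One way to picture this kind of wrapping is a small decorator that embeds the original value and overrides a single method. The sketch below is an illustration under that assumption, not the package's implementation; the sortingColumn interface here is a reduced, hypothetical version containing only the three methods it needs.

```go
package main

import "fmt"

// sortingColumn is a reduced, hypothetical version of a sorting column.
type sortingColumn interface {
	ColumnName() string
	Descending() bool
	NullsFirst() bool
}

// ascending sorts by the named column in ascending order, nulls last.
type ascending string

func (c ascending) ColumnName() string { return string(c) }
func (c ascending) Descending() bool   { return false }
func (c ascending) NullsFirst() bool   { return false }

// nullsFirst wraps another sortingColumn. Embedding forwards
// ColumnName and Descending; only NullsFirst is overridden.
type nullsFirst struct{ sortingColumn }

func (nullsFirst) NullsFirst() bool { return true }

func main() {
	col := nullsFirst{ascending("timestamp")}
	fmt.Println(col.ColumnName(), col.Descending(), col.NullsFirst()) // timestamp false true
}
```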
func ParquetSchemaFromV2Definition(def *schemav2pb.Schema) *parquet.Schema
PrehashColumns prehashes the columns in the given record that have been marked as prehashed in the given schema.
func ReaderFromBytes(buf []byte) (*SerializedBuffer, error)
RemoveHashedColumns removes the hashed columns from the record.
func SchemaFromDefinition(msg proto.Message) (*Schema, error)
SchemaFromParquetFile converts a parquet file into a dynparquet.Schema.
func SortingColumnsFromDef(def *schemav2pb.Schema) ([]parquet.SortingColumn, error)
func ToSnakeCase(str string) string
func WithAlreadySorted() MergeOption
func WithDynamicCols(cols map[string][]string) MergeOption
Package-Level Variables (total 9)
var GenerateTestSamples func(n int) samples.Samples
var LabelColumn func(name string) *schemav1alpha2.Node
var NewNestedSampleSchema func(t testing.TB) proto.Message
var NewTestSamples func() samples.Samples
var PrehashedSampleDefinition func() *schemav1alpha1.Schema
var SampleDefinition func() *schemav1alpha1.Schema
var SampleDefinitionWithFloat func() *schemav1alpha1.Schema
Package-Level Constants (total 2)
The size of the column indices in parquet files.
const DynamicColumnsKey = "dynamic_columns"