package iceberg
Import Path
github.com/polarsignals/iceberg-go
Dependency Relation
imports 24 packages, and imported by 3 packages
Involved Source Files
avro_schemas.go
errors.go
manifest.go
manifest_builder.go
parquet.go
partitions.go
schema.go
transforms.go
types.go
utils.go
Package-Level Type Names (total 58)
( AfterFieldVisitor) AfterField(field NestedField)
( AfterListElementVisitor) AfterListElement(elem NestedField)
( AfterMapKeyVisitor) AfterMapKey(key NestedField)
( AfterMapValueVisitor) AfterMapValue(value NestedField)
( BeforeFieldVisitor) BeforeField(field NestedField)
( BeforeListElementVisitor) BeforeListElement(elem NestedField)
( BeforeMapKeyVisitor) BeforeMapKey(key NestedField)
( BeforeMapValueVisitor) BeforeMapValue(value NestedField)
( BinaryType) Equals(other Type) bool
( BinaryType) String() string
( BinaryType) Type() string
BinaryType : PrimitiveType
BinaryType : Type
BinaryType : expvar.Var
BinaryType : fmt.Stringer
( BooleanType) Equals(other Type) bool
( BooleanType) String() string
( BooleanType) Type() string
BooleanType : PrimitiveType
BooleanType : Type
BooleanType : expvar.Var
BooleanType : fmt.Stringer
BucketTransform transforms values into a bucket partition value. It is
parameterized by a number of buckets. Bucket partition transforms take
a 32-bit hash of the source value and reduce it modulo the number of
buckets to produce a non-negative bucket value.
NumBuckets int
( BucketTransform) MarshalText() ([]byte, error)
( BucketTransform) ResultType(Type) Type
( BucketTransform) String() string
BucketTransform : Transform
BucketTransform : encoding.TextMarshaler
BucketTransform : expvar.Var
BucketTransform : fmt.Stringer
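The hash-then-modulo step described for BucketTransform can be sketched as follows. This is a minimal illustration, not this package's implementation: the listing above only exposes ResultType, MarshalText and String. The choice of Murmur3 (x86, 32-bit, seed 0) over the 8-byte little-endian form of the value follows the Iceberg table spec and is an assumption here, and the names murmur3x86_32 and bucketLong are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// murmur3x86_32 computes the 32-bit Murmur3 hash (x86 variant) of data.
func murmur3x86_32(data []byte, seed uint32) uint32 {
	const c1, c2 uint32 = 0xcc9e2d51, 0x1b873593
	h := seed
	n := len(data)
	// Body: mix each complete 4-byte little-endian block into the state.
	for i := 0; i+4 <= n; i += 4 {
		k := binary.LittleEndian.Uint32(data[i:])
		k = bits.RotateLeft32(k*c1, 15) * c2
		h = bits.RotateLeft32(h^k, 13)*5 + 0xe6546b64
	}
	// Tail: fold in the remaining 0 to 3 bytes, lowest byte first.
	var k uint32
	tail := data[n&^3:]
	for i := len(tail) - 1; i >= 0; i-- {
		k = k<<8 | uint32(tail[i])
	}
	if len(tail) > 0 {
		h ^= bits.RotateLeft32(k*c1, 15) * c2
	}
	// Finalization: mix in the length and avalanche the bits.
	h ^= uint32(n)
	h ^= h >> 16
	h *= 0x85ebca6b
	h ^= h >> 13
	h *= 0xc2b2ae35
	h ^= h >> 16
	return h
}

// bucketLong maps a 64-bit integer to a bucket in [0, numBuckets).
// The sign bit of the hash is cleared before the modulo so the result
// is never negative.
func bucketLong(v int64, numBuckets int) int {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], uint64(v))
	h := murmur3x86_32(buf[:], 0)
	return int(h&0x7fffffff) % numBuckets
}

func main() {
	for _, v := range []int64{1, 34, -7} {
		fmt.Printf("value %d -> bucket %d of 16\n", v, bucketLong(v, 16))
	}
}
```

Clearing the sign bit rather than taking an absolute value avoids the overflow case at the minimum 32-bit integer.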
DataFile is the interface for reading the information about a
given data file indicated by an entry in a manifest list.
ColumnSizes is a mapping from column id to the total size on disk
of all regions that store the column. Does not include bytes
necessary to read other columns, like footers. Map will be nil for
row-oriented formats (avro).
ContentType is the type of the content stored by the data file,
either Data, Equality deletes, or Position deletes. All v1 files
are Data files.
Count returns the number of records in this file.
DistinctValueCounts is a mapping from column id to the number of
distinct values in the column. Distinct counts must be derived
using values in the file by counting or using sketches, but not
using methods like merging existing distinct counts.
EqualityFieldIDs are used to determine row equality in equality
delete files. It is required when the content type is
EntryContentEqDeletes.
FileFormat is the format of the data file: Avro, ORC, or Parquet.
FilePath is the full URI for the file, complete with FS scheme.
FileSizeBytes is the total file size in bytes.
KeyMetadata is implementation-specific key metadata for encryption.
LowerBoundValues is a mapping from column id to the lower bound
of the column, serialized as binary. Each lower bound must be
less than or equal to all non-null, non-NaN values in the
column for the file.
NaNValueCounts is a mapping from column id to the number of NaN
values in the column.
NullValueCounts is a mapping from column id to the number of
null values in the column.
Partition returns a mapping of field name to partition value for
each of the partition spec's fields.
SortOrderID returns the id representing the sort order for this
file, or nil if there is no sort order.
SplitOffsets are the split offsets for the data file. For example,
all row group offsets in a Parquet file. Must be sorted ascending.
UpperBoundValues is a mapping from column id to the upper bound
of the column, serialized as binary. Each upper bound must be
greater than or equal to all non-null, non-NaN values in
the column for the file.
ValueCounts is a mapping from column id to the number of values
in the column, including null and NaN values.
DataFileBuilder
func DataFileFromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (DataFile, *Schema, error)
func DataFileBuilder.Build() DataFile
func ManifestEntry.DataFile() DataFile
func NewManifestEntryV1(entryStatus ManifestEntryStatus, snapshotID int64, data DataFile) ManifestEntry
dataFile.BlockSizeInBytes int64
dataFile.ColSizes *[]colMap[int, int64]
dataFile.Content ManifestEntryContent
dataFile.DistinctCounts *[]colMap[int, int64]
dataFile.EqualityIDs *[]int
dataFile.FileSize int64
dataFile.Format FileFormat
dataFile.Key *[]byte
dataFile.LowerBounds *[]colMap[int, []byte]
dataFile.NaNCounts *[]colMap[int, int64]
dataFile.NullCounts *[]colMap[int, int64]
dataFile.PartitionData map[string]any
dataFile.Path string
dataFile.RecordCount int64
dataFile.SortOrder *int
dataFile.Splits *[]int64
dataFile.UpperBounds *[]colMap[int, []byte]
dataFile.ValCounts *[]colMap[int, int64]
( DataFileBuilder) Build() DataFile
( DataFileBuilder) ColumnSizes() map[int]int64
( DataFileBuilder) ContentType() ManifestEntryContent
( DataFileBuilder) Count() int64
( DataFileBuilder) DistinctValueCounts() map[int]int64
( DataFileBuilder) EqualityFieldIDs() []int
( DataFileBuilder) FileFormat() FileFormat
( DataFileBuilder) FilePath() string
( DataFileBuilder) FileSizeBytes() int64
( DataFileBuilder) KeyMetadata() []byte
( DataFileBuilder) LowerBoundValues() map[int][]byte
( DataFileBuilder) NaNValueCounts() map[int]int64
( DataFileBuilder) NullValueCounts() map[int]int64
( DataFileBuilder) Partition() map[string]any
( DataFileBuilder) SortOrderID() *int
( DataFileBuilder) SplitOffsets() []int64
( DataFileBuilder) UpperBoundValues() map[int][]byte
( DataFileBuilder) ValueCounts() map[int]int64
( DataFileBuilder) WithColumnSizes(columnSizes map[int]int64) DataFileBuilder
( DataFileBuilder) WithDistinctCounts(distinctCounts map[int]int64) DataFileBuilder
( DataFileBuilder) WithKeyMetadata(keyMetadata []byte) DataFileBuilder
( DataFileBuilder) WithLowerBounds(lowerBounds map[int][]byte) DataFileBuilder
( DataFileBuilder) WithNanValueCounts(nanValueCounts map[int]int64) DataFileBuilder
( DataFileBuilder) WithNullValueCounts(nullValueCounts map[int]int64) DataFileBuilder
( DataFileBuilder) WithSortOrderID(sortOrderID int) DataFileBuilder
( DataFileBuilder) WithSplitOffsets(splitOffsets []int64) DataFileBuilder
( DataFileBuilder) WithUpperBounds(upperBounds map[int][]byte) DataFileBuilder
( DataFileBuilder) WithValueCounts(valueCounts map[int]int64) DataFileBuilder
DataFileBuilder : DataFile
func NewDataFileV1Builder(FilePath string, FileFormat FileFormat, PartitionSpec map[string]any, RecordCount int64, FileSizeBytes int64) DataFileBuilder
func DataFileBuilder.WithColumnSizes(columnSizes map[int]int64) DataFileBuilder
func DataFileBuilder.WithDistinctCounts(distinctCounts map[int]int64) DataFileBuilder
func DataFileBuilder.WithKeyMetadata(keyMetadata []byte) DataFileBuilder
func DataFileBuilder.WithLowerBounds(lowerBounds map[int][]byte) DataFileBuilder
func DataFileBuilder.WithNanValueCounts(nanValueCounts map[int]int64) DataFileBuilder
func DataFileBuilder.WithNullValueCounts(nullValueCounts map[int]int64) DataFileBuilder
func DataFileBuilder.WithSortOrderID(sortOrderID int) DataFileBuilder
func DataFileBuilder.WithSplitOffsets(splitOffsets []int64) DataFileBuilder
func DataFileBuilder.WithUpperBounds(upperBounds map[int][]byte) DataFileBuilder
func DataFileBuilder.WithValueCounts(valueCounts map[int]int64) DataFileBuilder
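The With* methods listed for DataFileBuilder take a value receiver and return a DataFileBuilder, which suggests a chained style in which each call's result is passed on. The sketch below shows that pattern on a hypothetical local type (fileBuilder, fileInfo and the example path are made up). Whether the real builder shares state between copies is not visible in this listing, so capturing the returned value is the safer habit.

```go
package main

import "fmt"

// fileInfo is a hypothetical stand-in for the built data file.
type fileInfo struct {
	path        string
	records     int64
	columnSizes map[int]int64
	sortOrderID *int
}

// fileBuilder uses value receivers: each With* call works on a copy
// and returns it, so callers must keep the returned builder.
type fileBuilder struct{ info fileInfo }

func newFileBuilder(path string, records int64) fileBuilder {
	return fileBuilder{info: fileInfo{path: path, records: records}}
}

func (b fileBuilder) WithColumnSizes(m map[int]int64) fileBuilder {
	b.info.columnSizes = m
	return b
}

func (b fileBuilder) WithSortOrderID(id int) fileBuilder {
	b.info.sortOrderID = &id
	return b
}

func (b fileBuilder) Build() fileInfo { return b.info }

func main() {
	base := newFileBuilder("s3://bucket/data/a.parquet", 100)
	built := base.
		WithColumnSizes(map[int]int64{1: 64}).
		WithSortOrderID(2).
		Build()
	fmt.Println(built.path, built.records, built.columnSizes, *built.sortOrderID)
	// In this sketch the original builder is unchanged by the chain.
	fmt.Println(base.Build().columnSizes == nil)
}
```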
DateType represents a calendar date without a timezone or time,
represented as a 32-bit integer denoting the number of days since
the unix epoch.
( DateType) Equals(other Type) bool
( DateType) String() string
( DateType) Type() string
DateType : PrimitiveType
DateType : Type
DateType : expvar.Var
DateType : fmt.Stringer
DayTransform transforms a datetime value into a date value.
( DayTransform) MarshalText() ([]byte, error)
( DayTransform) ResultType(Type) Type
( DayTransform) String() string
DayTransform : Transform
DayTransform : encoding.TextMarshaler
DayTransform : expvar.Var
DayTransform : fmt.Stringer
( DecimalType) Equals(other Type) bool
( DecimalType) Precision() int
( DecimalType) Scale() int
( DecimalType) String() string
( DecimalType) Type() string
DecimalType : Type
DecimalType : expvar.Var
DecimalType : fmt.Stringer
func DecimalTypeOf(prec, scale int) DecimalType
ContainsNaN *bool
ContainsNull bool
LowerBound *[]byte
UpperBound *[]byte
func ManifestFile.Partitions() []FieldSummary
func (*ManifestV1Builder).Partitions(p []FieldSummary) *ManifestV1Builder
func (*ManifestV2Builder).Partitions(p []FieldSummary) *ManifestV2Builder
FileFormat defines constants for the format of data files.
func DataFile.FileFormat() FileFormat
func NewDataFileV1Builder(FilePath string, FileFormat FileFormat, PartitionSpec map[string]any, RecordCount int64, FileSizeBytes int64) DataFileBuilder
const AvroFile
const OrcFile
const ParquetFile
( FixedType) Equals(other Type) bool
( FixedType) Len() int
( FixedType) String() string
( FixedType) Type() string
FixedType : Type
FixedType : expvar.Var
FixedType : fmt.Stringer
func FixedTypeOf(n int) FixedType
Float32Type is the "float" type in the iceberg spec.
( Float32Type) Equals(other Type) bool
( Float32Type) String() string
( Float32Type) Type() string
Float32Type : PrimitiveType
Float32Type : Type
Float32Type : expvar.Var
Float32Type : fmt.Stringer
Float64Type represents the "double" type of the iceberg spec.
( Float64Type) Equals(other Type) bool
( Float64Type) String() string
( Float64Type) Type() string
Float64Type : PrimitiveType
Float64Type : Type
Float64Type : expvar.Var
Float64Type : fmt.Stringer
HourTransform transforms a datetime value into an hour value.
( HourTransform) MarshalText() ([]byte, error)
( HourTransform) ResultType(Type) Type
( HourTransform) String() string
HourTransform : Transform
HourTransform : encoding.TextMarshaler
HourTransform : expvar.Var
HourTransform : fmt.Stringer
IdentityTransform uses the identity function, performing no transformation
but instead partitioning on the value itself.
( IdentityTransform) MarshalText() ([]byte, error)
( IdentityTransform) ResultType(t Type) Type
( IdentityTransform) String() string
IdentityTransform : Transform
IdentityTransform : encoding.TextMarshaler
IdentityTransform : expvar.Var
IdentityTransform : fmt.Stringer
Int32Type is the "int"/"integer" type of the iceberg spec.
( Int32Type) Equals(other Type) bool
( Int32Type) String() string
( Int32Type) Type() string
Int32Type : PrimitiveType
Int32Type : Type
Int32Type : expvar.Var
Int32Type : fmt.Stringer
Int64Type is the "long" type of the iceberg spec.
( Int64Type) Equals(other Type) bool
( Int64Type) String() string
( Int64Type) Type() string
Int64Type : PrimitiveType
Int64Type : Type
Int64Type : expvar.Var
Int64Type : fmt.Stringer
Element Type
ElementID int
ElementRequired bool
(*ListType) ElementField() NestedField
(*ListType) Equals(other Type) bool
(*ListType) Fields() []NestedField
(*ListType) MarshalJSON() ([]byte, error)
(*ListType) String() string
(*ListType) Type() string
(*ListType) UnmarshalJSON(b []byte) error
*ListType : NestedType
*ListType : Type
*ListType : github.com/goccy/go-json.Marshaler
*ListType : github.com/goccy/go-json.Unmarshaler
*ListType : encoding/json.Marshaler
*ListType : encoding/json.Unmarshaler
*ListType : expvar.Var
*ListType : fmt.Stringer
func SchemaVisitor.List(list ListType, elemResult T) T
ManifestContent indicates the type of data inside of the files
described by a manifest. This will indicate whether the data files
contain active data or deleted rows.
func ManifestFile.ManifestContent() ManifestContent
func NewManifestV2Builder(path string, length int64, partitionSpecID int32, content ManifestContent, addedSnapshotID int64) *ManifestV2Builder
const ManifestContentData
const ManifestContentDeletes
ManifestEntry is an interface for both v1 and v2 manifest entries.
DataFile provides the information about the data file indicated
by this manifest entry.
FileSequenceNum returns the file sequence number indicating
when the file was added. If it was null and the status is
EntryStatusADDED then it is inherited from the manifest list.
SequenceNum returns the data sequence number of the file.
If it was null and the status is EntryStatusADDED then it
is inherited from the manifest list.
SnapshotID is the id of the snapshot where the file was added or
deleted. If it is null, it is inherited from the manifest list.
Status returns the type of the file tracked by this entry.
Deletes are informational only and not used in scans.
func ManifestEntryV1FromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (ManifestEntry, *Schema, error)
func NewManifestEntryV1(entryStatus ManifestEntryStatus, snapshotID int64, data DataFile) ManifestEntry
func ManifestFile.FetchEntries(bucket objstore.Bucket, discardDeleted bool) ([]ManifestEntry, *Schema, error)
func AvroSchemaFromEntriesV1(entries []ManifestEntry) string
func WriteManifestV1(w io.Writer, schema *Schema, entries []ManifestEntry) error
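The inheritance rule for sequence numbers described above can be sketched as a small resolver: an explicit value wins, and a null value is filled from the manifest list only for entries with added status. The entry type, the status constants and resolveSeqNum are hypothetical stand-ins for illustration, not this package's types.

```go
package main

import "fmt"

type status int

const (
	statusExisting status = iota
	statusAdded
	statusDeleted
)

// entry is a hypothetical manifest entry with a nullable sequence number.
type entry struct {
	status status
	seqNum *int64 // nil models a null value in the manifest
}

// resolveSeqNum returns the effective sequence number of e. The second
// result is false when the value is null and cannot be inherited.
func resolveSeqNum(e entry, manifestSeqNum int64) (int64, bool) {
	if e.seqNum != nil {
		return *e.seqNum, true
	}
	if e.status == statusAdded {
		return manifestSeqNum, true
	}
	return 0, false
}

func main() {
	explicit := int64(7)
	fmt.Println(resolveSeqNum(entry{statusExisting, &explicit}, 12))
	fmt.Println(resolveSeqNum(entry{statusAdded, nil}, 12))
	fmt.Println(resolveSeqNum(entry{statusDeleted, nil}, 12))
}
```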
ManifestEntryContent defines constants for the type of file contents
in the file entries. Data, Position based deletes and equality based
deletes.
func DataFile.ContentType() ManifestEntryContent
const EntryContentData
const EntryContentEqDeletes
const EntryContentPosDeletes
ManifestEntryStatus defines constants for the entry status of
existing, added or deleted.
func ManifestEntry.Status() ManifestEntryStatus
func NewManifestEntryV1(entryStatus ManifestEntryStatus, snapshotID int64, data DataFile) ManifestEntry
const EntryStatusADDED
const EntryStatusDELETED
const EntryStatusEXISTING
ManifestFile is the interface which covers both V1 and V2 manifest files.
AddedDataFiles returns the number of entries in the manifest that
have the status of EntryStatusADDED.
AddedRows returns the number of rows in all files of the manifest
that have status EntryStatusADDED.
DeletedDataFiles returns the number of entries in the manifest
which have the status of EntryStatusDELETED.
DeletedRows returns the number of rows in all files of the manifest
which have status EntryStatusDELETED.
ExistingDataFiles returns the number of entries in the manifest
which have the status of EntryStatusEXISTING.
ExistingRows returns the number of rows in all files of the manifest
which have status EntryStatusEXISTING.
FetchEntries reads the manifest list file to fetch the list of
manifest entries using the provided bucket. It will return the schema of the table
when the manifest was written.
If discardDeleted is true, entries for files containing deleted rows
will be skipped.
FilePath is the location URI of this manifest file.
HasAddedFiles returns true if AddedDataFiles > 0 or if it was null.
HasExistingFiles returns true if ExistingDataFiles > 0 or if it was null.
KeyMetadata returns implementation-specific key metadata for encryption
if it exists in the manifest list.
Length is the length in bytes of the manifest file.
ManifestContent is the type of files tracked by this manifest,
either data or delete files. All v1 manifests track data files.
MinSequenceNum is the minimum data sequence number of all live data
or delete files in the manifest. Will be 0 for v1 manifest lists.
PartitionSpecID is the ID of the partition spec used to write
this manifest. It must be listed in the table metadata
partition-specs.
Partitions returns a list of field summaries for each partition
field in the spec. Each field in the list corresponds to a field in
the manifest file's partition spec.
SequenceNum returns the sequence number when this manifest was
added to the table. Will be 0 for v1 manifest lists.
SnapshotID is the ID of the snapshot where this manifest file
was added.
Version returns the version number of this manifest file.
It should be 1 or 2.
func ReadManifestList(in io.Reader) ([]ManifestFile, error)
func (*ManifestV1Builder).Build() ManifestFile
func (*ManifestV2Builder).Build() ManifestFile
func github.com/polarsignals/iceberg-go/table.Snapshot.Manifests(bucket objstore.Bucket) ([]ManifestFile, error)
func WriteManifestListV1(w io.Writer, files []ManifestFile) error
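HasAddedFiles and HasExistingFiles treat a null count as "possibly non-empty". A minimal sketch of that rule, modelling the nullable count as a pointer (hasFiles is a hypothetical helper, not part of this package):

```go
package main

import "fmt"

// hasFiles reports whether a manifest may contain files of some status.
// A nil count models a null field, which must be treated conservatively
// as "unknown, so assume there are files".
func hasFiles(count *int32) bool {
	return count == nil || *count > 0
}

func main() {
	zero, three := int32(0), int32(3)
	fmt.Println(hasFiles(nil), hasFiles(&zero), hasFiles(&three))
}
```

Treating null as true keeps readers from skipping a manifest whose counts were simply not recorded.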
ManifestV1Builder is a helper for building a V1 manifest file
struct which will conform to the ManifestFile interface.
(*ManifestV1Builder) AddedFiles(cnt int32) *ManifestV1Builder
(*ManifestV1Builder) AddedRows(cnt int64) *ManifestV1Builder
Build returns the constructed manifest file. After calling Build, this
builder should not be used further: to avoid copying, Build returns
a pointer to the constructed manifest file, so further calls to the modifier
methods would modify the constructed ManifestFile.
(*ManifestV1Builder) DeletedFiles(cnt int32) *ManifestV1Builder
(*ManifestV1Builder) DeletedRows(cnt int64) *ManifestV1Builder
(*ManifestV1Builder) ExistingFiles(cnt int32) *ManifestV1Builder
(*ManifestV1Builder) ExistingRows(cnt int64) *ManifestV1Builder
(*ManifestV1Builder) KeyMetadata(km []byte) *ManifestV1Builder
(*ManifestV1Builder) Partitions(p []FieldSummary) *ManifestV1Builder
func NewManifestV1Builder(path string, length int64, partitionSpecID int32, addedSnapshotID int64) *ManifestV1Builder
func (*ManifestV1Builder).AddedFiles(cnt int32) *ManifestV1Builder
func (*ManifestV1Builder).AddedRows(cnt int64) *ManifestV1Builder
func (*ManifestV1Builder).DeletedFiles(cnt int32) *ManifestV1Builder
func (*ManifestV1Builder).DeletedRows(cnt int64) *ManifestV1Builder
func (*ManifestV1Builder).ExistingFiles(cnt int32) *ManifestV1Builder
func (*ManifestV1Builder).ExistingRows(cnt int64) *ManifestV1Builder
func (*ManifestV1Builder).KeyMetadata(km []byte) *ManifestV1Builder
func (*ManifestV1Builder).Partitions(p []FieldSummary) *ManifestV1Builder
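The warning on Build is easier to see with a toy example. The sketch below uses hypothetical local types (manifest, manifestBuilder) that mimic a pointer-receiver builder whose Build hands out its internal pointer; it is an illustration of the aliasing hazard, not the package's actual code.

```go
package main

import "fmt"

// manifest is a hypothetical stand-in for the built manifest file.
type manifest struct {
	path       string
	addedFiles int32
}

// manifestBuilder mutates a single manifest through a pointer.
type manifestBuilder struct{ m *manifest }

func newManifestBuilder(path string) *manifestBuilder {
	return &manifestBuilder{m: &manifest{path: path}}
}

func (b *manifestBuilder) AddedFiles(n int32) *manifestBuilder {
	b.m.addedFiles = n
	return b
}

// Build returns the internal pointer without copying.
func (b *manifestBuilder) Build() *manifest { return b.m }

func main() {
	b := newManifestBuilder("manifest-1.avro")
	m := b.AddedFiles(1).Build()
	fmt.Println(m.addedFiles) // value set before Build

	// Reusing the builder after Build changes the already built value.
	b.AddedFiles(5)
	fmt.Println(m.addedFiles)
}
```

Creating a fresh builder for each manifest avoids this kind of accidental sharing.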
ManifestV2Builder is a helper for building a V2 manifest file
struct which will conform to the ManifestFile interface.
(*ManifestV2Builder) AddedFiles(cnt int32) *ManifestV2Builder
(*ManifestV2Builder) AddedRows(cnt int64) *ManifestV2Builder
Build returns the constructed manifest file. After calling Build, this
builder should not be used further: to avoid copying, Build returns
a pointer to the constructed manifest file, so further calls to the modifier
methods would modify the constructed ManifestFile.
(*ManifestV2Builder) DeletedFiles(cnt int32) *ManifestV2Builder
(*ManifestV2Builder) DeletedRows(cnt int64) *ManifestV2Builder
(*ManifestV2Builder) ExistingFiles(cnt int32) *ManifestV2Builder
(*ManifestV2Builder) ExistingRows(cnt int64) *ManifestV2Builder
(*ManifestV2Builder) KeyMetadata(km []byte) *ManifestV2Builder
(*ManifestV2Builder) Partitions(p []FieldSummary) *ManifestV2Builder
(*ManifestV2Builder) SequenceNum(num, minSeqNum int64) *ManifestV2Builder
func NewManifestV2Builder(path string, length int64, partitionSpecID int32, content ManifestContent, addedSnapshotID int64) *ManifestV2Builder
func (*ManifestV2Builder).AddedFiles(cnt int32) *ManifestV2Builder
func (*ManifestV2Builder).AddedRows(cnt int64) *ManifestV2Builder
func (*ManifestV2Builder).DeletedFiles(cnt int32) *ManifestV2Builder
func (*ManifestV2Builder).DeletedRows(cnt int64) *ManifestV2Builder
func (*ManifestV2Builder).ExistingFiles(cnt int32) *ManifestV2Builder
func (*ManifestV2Builder).ExistingRows(cnt int64) *ManifestV2Builder
func (*ManifestV2Builder).KeyMetadata(km []byte) *ManifestV2Builder
func (*ManifestV2Builder).Partitions(p []FieldSummary) *ManifestV2Builder
func (*ManifestV2Builder).SequenceNum(num, minSeqNum int64) *ManifestV2Builder
KeyID int
KeyType Type
ValueID int
ValueRequired bool
ValueType Type
(*MapType) Equals(other Type) bool
(*MapType) Fields() []NestedField
(*MapType) KeyField() NestedField
(*MapType) MarshalJSON() ([]byte, error)
(*MapType) String() string
(*MapType) Type() string
(*MapType) UnmarshalJSON(b []byte) error
(*MapType) ValueField() NestedField
*MapType : NestedType
*MapType : Type
*MapType : github.com/goccy/go-json.Marshaler
*MapType : github.com/goccy/go-json.Unmarshaler
*MapType : encoding/json.Marshaler
*MapType : encoding/json.Unmarshaler
*MapType : expvar.Var
*MapType : fmt.Stringer
func SchemaVisitor.Map(mapType MapType, keyResult, valueResult T) T
MonthTransform transforms a datetime value into a month value.
( MonthTransform) MarshalText() ([]byte, error)
( MonthTransform) ResultType(Type) Type
( MonthTransform) String() string
MonthTransform : Transform
MonthTransform : encoding.TextMarshaler
MonthTransform : expvar.Var
MonthTransform : fmt.Stringer
Doc string
ID int
InitialDefault any
Name string
Required bool
Type Type
WriteDefault any
(*NestedField) Equals(other NestedField) bool
( NestedField) MarshalJSON() ([]byte, error)
( NestedField) String() string
(*NestedField) UnmarshalJSON(b []byte) error
NestedField : github.com/goccy/go-json.Marshaler
*NestedField : github.com/goccy/go-json.Unmarshaler
NestedField : encoding/json.Marshaler
*NestedField : encoding/json.Unmarshaler
NestedField : expvar.Var
NestedField : fmt.Stringer
func IndexByID(schema *Schema) (map[int]NestedField, error)
func (*ListType).ElementField() NestedField
func (*ListType).Fields() []NestedField
func (*MapType).Fields() []NestedField
func (*MapType).KeyField() NestedField
func (*MapType).ValueField() NestedField
func NestedType.Fields() []NestedField
func (*Schema).Field(i int) NestedField
func (*Schema).Fields() []NestedField
func (*Schema).FindFieldByID(id int) (NestedField, bool)
func (*Schema).FindFieldByName(name string) (NestedField, bool)
func (*Schema).FindFieldByNameCaseInsensitive(name string) (NestedField, bool)
func (*StructType).Fields() []NestedField
func NewSchema(id int, fields ...NestedField) *Schema
func NewSchemaWithIdentifiers(id int, identifierIDs []int, fields ...NestedField) *Schema
func AfterFieldVisitor.AfterField(field NestedField)
func AfterListElementVisitor.AfterListElement(elem NestedField)
func AfterMapKeyVisitor.AfterMapKey(key NestedField)
func AfterMapValueVisitor.AfterMapValue(value NestedField)
func BeforeFieldVisitor.BeforeField(field NestedField)
func BeforeListElementVisitor.BeforeListElement(elem NestedField)
func BeforeMapKeyVisitor.BeforeMapKey(key NestedField)
func BeforeMapValueVisitor.BeforeMapValue(value NestedField)
func (*NestedField).Equals(other NestedField) bool
func SchemaVisitor.Field(field NestedField, fieldResult T) T
NestedType is an interface that allows access to the child fields of
a nested type such as a list/struct/map type.
( NestedType) Equals(Type) bool
( NestedType) Fields() []NestedField
( NestedType) String() string
( NestedType) Type() string
*ListType
*MapType
*StructType
NestedType : Type
NestedType : expvar.Var
NestedType : fmt.Stringer
PartitionField represents how one partition value is derived from the
source column by transformation.
FieldID is the partition field id across all the table partition specs
Name is the name of the partition field itself
SourceID is the source column id of the table's schema
Transform is the transform used to produce the partition value
(*PartitionField) String() string
(*PartitionField) UnmarshalJSON(b []byte) error
*PartitionField : github.com/goccy/go-json.Unmarshaler
*PartitionField : encoding/json.Unmarshaler
*PartitionField : expvar.Var
*PartitionField : fmt.Stringer
func (*PartitionSpec).Field(i int) PartitionField
func (*PartitionSpec).FieldsBySourceID(fieldID int) []PartitionField
func NewPartitionSpec(fields ...PartitionField) PartitionSpec
func NewPartitionSpecID(id int, fields ...PartitionField) PartitionSpec
PartitionSpec captures the transformation from table data to partition values
CompatibleWith returns true if this partition spec is considered
compatible with the passed in partition spec. This means that the two
specs have equivalent field lists regardless of the spec id.
Equals returns true iff the field lists are the same AND the spec id
is the same between this partition spec and the provided one.
(*PartitionSpec) Field(i int) PartitionField
(*PartitionSpec) FieldsBySourceID(fieldID int) []PartitionField
(*PartitionSpec) ID() int
(*PartitionSpec) IsUnpartitioned() bool
(*PartitionSpec) LastAssignedFieldID() int
( PartitionSpec) MarshalJSON() ([]byte, error)
(*PartitionSpec) NumFields() int
PartitionType produces a struct of the partition spec.
The partition fields should be optional:
- All partition transforms are required to produce null if the input value
is null. This can happen when the source column is optional.
- Partition fields may be added later, in which case not all files would
have the result field and it may be null.
There is a case where we can guarantee that a partition field in the first
and only partition spec that uses a required source column will never be
null, but it doesn't seem worth tracking this case.
( PartitionSpec) String() string
(*PartitionSpec) UnmarshalJSON(b []byte) error
PartitionSpec : github.com/goccy/go-json.Marshaler
*PartitionSpec : github.com/goccy/go-json.Unmarshaler
PartitionSpec : encoding/json.Marshaler
*PartitionSpec : encoding/json.Unmarshaler
PartitionSpec : expvar.Var
PartitionSpec : fmt.Stringer
func NewPartitionSpec(fields ...PartitionField) PartitionSpec
func NewPartitionSpecID(id int, fields ...PartitionField) PartitionSpec
func github.com/polarsignals/iceberg-go/table.Metadata.PartitionSpec() PartitionSpec
func github.com/polarsignals/iceberg-go/table.Metadata.PartitionSpecs() []PartitionSpec
func github.com/polarsignals/iceberg-go/table.Table.Spec() PartitionSpec
func (*PartitionSpec).CompatibleWith(other *PartitionSpec) bool
func (*PartitionSpec).Equals(other PartitionSpec) bool
func github.com/polarsignals/iceberg-go/catalog.WithPartitionSpec(spec PartitionSpec) catalog.TableOption
func github.com/polarsignals/iceberg-go/table.(*MetadataV1Builder).WithPartitionSpecs(specs []PartitionSpec) *table.MetadataV1Builder
func github.com/polarsignals/frostdb/storage.WithIcebergPartitionSpec(spec PartitionSpec) storage.IcebergOption
var UnpartitionedSpec *PartitionSpec
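The difference between CompatibleWith and Equals comes down to whether the spec id takes part in the comparison. The sketch below shows this on hypothetical local types (partField, partSpec). It treats "equivalent field lists" as element-wise equality of all field attributes, which is an assumption; the real method may compare fewer attributes.

```go
package main

import "fmt"

// partField and partSpec are hypothetical stand-ins for illustration.
type partField struct {
	SourceID  int
	FieldID   int
	Name      string
	Transform string
}

type partSpec struct {
	ID     int
	Fields []partField
}

// sameFields reports whether both lists hold equal fields in the same order.
func sameFields(a, b []partField) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// compatibleWith ignores the spec id and compares only the field lists.
func (s partSpec) compatibleWith(o partSpec) bool { return sameFields(s.Fields, o.Fields) }

// equals requires both the same id and the same field lists.
func (s partSpec) equals(o partSpec) bool { return s.ID == o.ID && s.compatibleWith(o) }

func main() {
	fields := []partField{{SourceID: 1, FieldID: 1000, Name: "ts_day", Transform: "day"}}
	a := partSpec{ID: 0, Fields: fields}
	b := partSpec{ID: 1, Fields: fields}
	fmt.Println(a.compatibleWith(b), a.equals(b))
}
```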
( PrimitiveType) Equals(Type) bool
( PrimitiveType) String() string
( PrimitiveType) Type() string
BinaryType
BooleanType
DateType
Float32Type
Float64Type
Int32Type
Int64Type
StringType
TimestampType
TimestampTzType
TimeType
UUIDType
PrimitiveType : Type
PrimitiveType : expvar.Var
PrimitiveType : fmt.Stringer
func SchemaVisitor.Primitive(p PrimitiveType) T
func github.com/polarsignals/iceberg-go/catalog.Catalog.LoadNamespaceProperties(ctx context.Context, namespace table.Identifier) (Properties, error)
func github.com/polarsignals/iceberg-go/table.Metadata.Properties() Properties
func github.com/polarsignals/iceberg-go/table.Table.Properties() Properties
func github.com/polarsignals/iceberg-go/catalog.Catalog.CreateNamespace(ctx context.Context, namespace table.Identifier, props Properties) error
func github.com/polarsignals/iceberg-go/catalog.Catalog.CreateTable(ctx context.Context, location string, schema *Schema, props Properties, options ...catalog.TableOption) (table.Table, error)
func github.com/polarsignals/iceberg-go/catalog.Catalog.LoadTable(ctx context.Context, identifier table.Identifier, props Properties) (table.Table, error)
func github.com/polarsignals/iceberg-go/catalog.Catalog.UpdateNamespaceProperties(ctx context.Context, namespace table.Identifier, removals []string, updates Properties) (catalog.PropertiesUpdateSummary, error)
func github.com/polarsignals/iceberg-go/table.(*MetadataV1Builder).WithProperties(properties Properties) *table.MetadataV1Builder
Schema is an Iceberg table schema, represented as a struct with
multiple fields. The fields are exposed only via accessor methods
rather than as a slice directly, to ensure that a schema
is immutable.
ID int
IdentifierFieldIDs []int
AsStruct returns a Struct with the same fields as the schema which can
then be used as a Type.
Equals compares the fields and identifierIDs, but does not compare
the schema ID itself.
(*Schema) Field(i int) NestedField
(*Schema) Fields() []NestedField
FindColumnName returns the name of the column identified by the
passed in field id. The second return value reports whether or
not the field id was found in the schema.
FindFieldByID is like [*Schema.FindColumnName], but returns the whole
field rather than just the field name.
FindFieldByName returns the field identified by the name given,
the second return value will be false if no field by this name
is found.
Note: This search is done in a case sensitive manner. To perform
a case insensitive search, use [*Schema.FindFieldByNameCaseInsensitive].
FindFieldByNameCaseInsensitive is like [*Schema.FindFieldByName],
but performs a case insensitive search.
FindTypeByID is like [*Schema.FindFieldByID], but returns only the data
type of the field.
FindTypeByName is a convenience function for calling [*Schema.FindFieldByName],
and then returning just the type.
FindTypeByNameCaseInsensitive is like [*Schema.FindTypeByName] but
performs a case insensitive search.
HighestFieldID returns the value of the numerically highest field ID
in this schema.
(*Schema) MarshalJSON() ([]byte, error)
Merge combines two schemas into a single schema. It returns a schema
with an ID that is one greater than the ID of the first schema.
If the two schemas have the same fields, the first schema is returned.
(*Schema) NumFields() int
Select creates a new schema with just the fields identified by name
passed in the order they are provided. If caseSensitive is false,
then fields will be identified by case insensitive search.
An error is returned if a requested name cannot be found.
(*Schema) String() string
(*Schema) Type() string
(*Schema) UnmarshalJSON(b []byte) error
*Schema : github.com/goccy/go-json.Marshaler
*Schema : github.com/goccy/go-json.Unmarshaler
*Schema : encoding/json.Marshaler
*Schema : encoding/json.Unmarshaler
*Schema : expvar.Var
*Schema : fmt.Stringer
func DataFileFromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (DataFile, *Schema, error)
func ManifestEntryV1FromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (ManifestEntry, *Schema, error)
func NewSchema(id int, fields ...NestedField) *Schema
func NewSchemaWithIdentifiers(id int, identifierIDs []int, fields ...NestedField) *Schema
func PruneColumns(schema *Schema, selected map[int]Void, selectFullTypes bool) (*Schema, error)
func ManifestFile.FetchEntries(bucket objstore.Bucket, discardDeleted bool) ([]ManifestEntry, *Schema, error)
func (*Schema).Merge(other *Schema) (*Schema, error)
func (*Schema).Select(caseSensitive bool, names ...string) (*Schema, error)
func github.com/polarsignals/iceberg-go/table.Metadata.CurrentSchema() *Schema
func github.com/polarsignals/iceberg-go/table.Metadata.Schemas() []*Schema
func github.com/polarsignals/iceberg-go/table.(*MetadataV1).CurrentSchema() *Schema
func github.com/polarsignals/iceberg-go/table.Table.Schema() *Schema
func github.com/polarsignals/iceberg-go/table.Table.Schemas() map[int]*Schema
func DataFileFromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (DataFile, *Schema, error)
func IndexByID(schema *Schema) (map[int]NestedField, error)
func IndexByName(schema *Schema) (map[string]int, error)
func IndexNameByID(schema *Schema) (map[int]string, error)
func ManifestEntryV1FromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (ManifestEntry, *Schema, error)
func PruneColumns(schema *Schema, selected map[int]Void, selectFullTypes bool) (*Schema, error)
func Visit[T](sc *Schema, visitor SchemaVisitor[T]) (res T, err error)
func WriteManifestV1(w io.Writer, schema *Schema, entries []ManifestEntry) error
func (*PartitionSpec).PartitionType(schema *Schema) *StructType
func (*Schema).Equals(other *Schema) bool
func (*Schema).Merge(other *Schema) (*Schema, error)
func SchemaVisitor.Schema(schema *Schema, structResult T) T
func github.com/polarsignals/iceberg-go/catalog.Catalog.CreateTable(ctx context.Context, location string, schema *Schema, props Properties, options ...catalog.TableOption) (table.Table, error)
func github.com/polarsignals/iceberg-go/table.NewMetadataV1Builder(location string, schema *Schema, lastUpdatesMs int64, lastColumnId int) *table.MetadataV1Builder
func github.com/polarsignals/iceberg-go/table.(*MetadataV1Builder).WithSchema(schema *Schema) *table.MetadataV1Builder
func github.com/polarsignals/iceberg-go/table.(*MetadataV1Builder).WithSchemas(schemas []*Schema) *table.MetadataV1Builder
var PositionalDeleteSchema *Schema
Type Parameters:
T: any
SchemaVisitor is an interface that can be implemented to allow for
easy traversal and processing of a schema.
A SchemaVisitor can also optionally implement the Before/After Field,
ListElement, MapKey, or MapValue interfaces to allow them to get called
at the appropriate points within schema traversal.
( SchemaVisitor[T]) Field(field NestedField, fieldResult T) T
( SchemaVisitor[T]) List(list ListType, elemResult T) T
( SchemaVisitor[T]) Map(mapType MapType, keyResult, valueResult T) T
( SchemaVisitor[T]) Primitive(p PrimitiveType) T
( SchemaVisitor[T]) Schema(schema *Schema, structResult T) T
( SchemaVisitor[T]) Struct(st StructType, fieldResults []T) T
func Visit[T](sc *Schema, visitor SchemaVisitor[T]) (res T, err error)
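The post-order visitor pattern described for SchemaVisitor can be illustrated with a minimal, self-contained sketch. The types `node`, `visitor`, `leafCounter` and the function `visit` below are hypothetical stand-ins, not this package's API; they only show how child results are computed first and then handed to the parent callback.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for a schema type tree.
// It is not the package's Schema, StructType, or NestedField.
type node struct {
	name     string
	children []node // empty for primitives in this sketch
}

// visitor mirrors the shape of a result-producing visitor: results for
// the children are passed to the callback for their parent.
type visitor[T any] interface {
	Primitive(n node) T
	Struct(n node, fieldResults []T) T
}

// visit performs a post-order traversal: children before parents.
func visit[T any](n node, v visitor[T]) T {
	if len(n.children) == 0 {
		return v.Primitive(n)
	}
	results := make([]T, 0, len(n.children))
	for _, c := range n.children {
		results = append(results, visit(c, v))
	}
	return v.Struct(n, results)
}

// leafCounter counts the primitive leaves of the tree.
type leafCounter struct{}

func (leafCounter) Primitive(node) int { return 1 }

func (leafCounter) Struct(_ node, fieldResults []int) int {
	total := 0
	for _, r := range fieldResults {
		total += r
	}
	return total
}

func main() {
	schema := node{name: "root", children: []node{
		{name: "id"},
		{name: "location", children: []node{{name: "lat"}, {name: "long"}}},
	}}
	fmt.Println(visit[int](schema, leafCounter{})) // 3
}
```

Returning a value from each callback lets a visitor build its result bottom-up without keeping mutable state of its own.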
( StringType) Equals(other Type) bool
( StringType) String() string
( StringType) Type() string
StringType : PrimitiveType
StringType : Type
StringType : expvar.Var
StringType : fmt.Stringer
FieldList []NestedField
(*StructType) Equals(other Type) bool
(*StructType) Fields() []NestedField
(*StructType) MarshalJSON() ([]byte, error)
(*StructType) String() string
(*StructType) Type() string
*StructType : NestedType
*StructType : Type
*StructType : github.com/goccy/go-json.Marshaler
*StructType : encoding/json.Marshaler
*StructType : expvar.Var
*StructType : fmt.Stringer
func (*PartitionSpec).PartitionType(schema *Schema) *StructType
func (*Schema).AsStruct() StructType
func SchemaVisitor.Struct(st StructType, fieldResults []T) T
TimestampType represents a number of microseconds since the unix epoch
without regard for timezone.
( TimestampType) Equals(other Type) bool
( TimestampType) String() string
( TimestampType) Type() string
TimestampType : PrimitiveType
TimestampType : Type
TimestampType : expvar.Var
TimestampType : fmt.Stringer
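As a small illustration of the "microseconds since the unix epoch" encoding, the sketch below converts between a raw int64 and a Go `time.Time` using only the standard library. The helper names `microsToTime` and `timeToMicros` are hypothetical, and rendering the value in UTC is a convention of this sketch, since a TimestampType value carries no timezone.

```go
package main

import (
	"fmt"
	"time"
)

// microsToTime interprets v as microseconds since the Unix epoch.
// UTC is used only as a display convention in this sketch.
func microsToTime(v int64) time.Time {
	return time.UnixMicro(v).UTC()
}

// timeToMicros is the inverse conversion.
func timeToMicros(t time.Time) int64 {
	return t.UnixMicro()
}

func main() {
	ts := microsToTime(1_700_000_000_123_456)
	fmt.Println(ts.Format(time.RFC3339Nano))
	fmt.Println(timeToMicros(ts))
}
```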
TimestampTzType represents a timestamp stored as UTC representing the
number of microseconds since the unix epoch.
( TimestampTzType) Equals(other Type) bool
( TimestampTzType) String() string
( TimestampTzType) Type() string
TimestampTzType : PrimitiveType
TimestampTzType : Type
TimestampTzType : expvar.Var
TimestampTzType : fmt.Stringer
TimeType represents a number of microseconds since midnight.
( TimeType) Equals(other Type) bool
( TimeType) String() string
( TimeType) Type() string
TimeType : PrimitiveType
TimeType : Type
TimeType : expvar.Var
TimeType : fmt.Stringer
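The "microseconds since midnight" encoding can be decoded with plain integer arithmetic. The sketch below is illustrative only; `microsToClock` and `clockToMicros` are hypothetical helpers, not functions of this package.

```go
package main

import "fmt"

const (
	microsPerSecond = int64(1_000_000)
	microsPerMinute = 60 * microsPerSecond
	microsPerHour   = 60 * microsPerMinute
)

// microsToClock splits microseconds since midnight into clock fields.
func microsToClock(v int64) (h, m, s, us int64) {
	h = v / microsPerHour
	v %= microsPerHour
	m = v / microsPerMinute
	v %= microsPerMinute
	s = v / microsPerSecond
	us = v % microsPerSecond
	return
}

// clockToMicros is the inverse of microsToClock.
func clockToMicros(h, m, s, us int64) int64 {
	return h*microsPerHour + m*microsPerMinute + s*microsPerSecond + us
}

func main() {
	v := clockToMicros(13, 45, 30, 250_000)
	h, m, s, us := microsToClock(v)
	fmt.Printf("%d -> %02d:%02d:%02d.%06d\n", v, h, m, s, us)
}
```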
Transform is an interface for the various Transformation types
in partition specs. Currently, they do not yet provide actual
transformation functions or implementation. That will come later as
data reading gets implemented.
( Transform) MarshalText() (text []byte, err error)
( Transform) ResultType(t Type) Type
( Transform) String() string
BucketTransform
DayTransform
HourTransform
IdentityTransform
MonthTransform
TruncateTransform
VoidTransform
YearTransform
Transform : encoding.TextMarshaler
Transform : expvar.Var
Transform : fmt.Stringer
func ParseTransform(s string) (Transform, error)
TruncateTransform is a transformation for truncating a value to a specified width.
Width int
( TruncateTransform) MarshalText() ([]byte, error)
( TruncateTransform) ResultType(t Type) Type
( TruncateTransform) String() string
TruncateTransform : Transform
TruncateTransform : encoding.TextMarshaler
TruncateTransform : expvar.Var
TruncateTransform : fmt.Stringer
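The Transform documentation notes that these types do not yet provide transformation functions, so the following is only a sketch of what truncation to a width means under the Iceberg table spec's definition: integers are rounded down to a multiple of the width, and strings keep at most that many Unicode code points. `truncateInt` and `truncateString` are hypothetical names, not part of this package.

```go
package main

import "fmt"

// truncateInt rounds v down to the nearest multiple of width.
// The double modulo keeps the remainder non-negative for negative v.
// width is assumed to be positive.
func truncateInt(width, v int64) int64 {
	return v - (((v % width) + width) % width)
}

// truncateString keeps at most width Unicode code points of s.
func truncateString(width int, s string) string {
	r := []rune(s)
	if len(r) <= width {
		return s
	}
	return string(r[:width])
}

func main() {
	fmt.Println(truncateInt(10, 17))          // 10
	fmt.Println(truncateInt(10, -1))          // -10
	fmt.Println(truncateString(3, "iceberg")) // ice
}
```

Rounding toward negative infinity, rather than toward zero, keeps every truncation bucket the same size on both sides of zero.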
Type is an interface representing any of the available iceberg types,
such as primitives (int32/int64/etc.) or nested types (list/struct/map).
( Type) Equals(Type) bool
( Type) String() string
( Type) Type() string
BinaryType
BooleanType
DateType
DecimalType
FixedType
Float32Type
Float64Type
Int32Type
Int64Type
*ListType
*MapType
NestedType (interface)
PrimitiveType (interface)
StringType
*StructType
TimestampType
TimestampTzType
TimeType
UUIDType
Type : expvar.Var
Type : fmt.Stringer
func BucketTransform.ResultType(Type) Type
func DayTransform.ResultType(Type) Type
func HourTransform.ResultType(Type) Type
func IdentityTransform.ResultType(t Type) Type
func MonthTransform.ResultType(Type) Type
func (*Schema).FindTypeByID(id int) (Type, bool)
func (*Schema).FindTypeByName(name string) (Type, bool)
func (*Schema).FindTypeByNameCaseInsensitive(name string) (Type, bool)
func Transform.ResultType(t Type) Type
func TruncateTransform.ResultType(t Type) Type
func VoidTransform.ResultType(t Type) Type
func YearTransform.ResultType(Type) Type
func BinaryType.Equals(other Type) bool
func BooleanType.Equals(other Type) bool
func BucketTransform.ResultType(Type) Type
func DateType.Equals(other Type) bool
func DayTransform.ResultType(Type) Type
func DecimalType.Equals(other Type) bool
func FixedType.Equals(other Type) bool
func Float32Type.Equals(other Type) bool
func Float64Type.Equals(other Type) bool
func HourTransform.ResultType(Type) Type
func IdentityTransform.ResultType(t Type) Type
func Int32Type.Equals(other Type) bool
func Int64Type.Equals(other Type) bool
func (*ListType).Equals(other Type) bool
func (*MapType).Equals(other Type) bool
func MonthTransform.ResultType(Type) Type
func NestedType.Equals(Type) bool
func PrimitiveType.Equals(Type) bool
func StringType.Equals(other Type) bool
func (*StructType).Equals(other Type) bool
func TimestampType.Equals(other Type) bool
func TimestampTzType.Equals(other Type) bool
func TimeType.Equals(other Type) bool
func Transform.ResultType(t Type) Type
func TruncateTransform.ResultType(t Type) Type
func Type.Equals(Type) bool
func UUIDType.Equals(other Type) bool
func VoidTransform.ResultType(t Type) Type
func YearTransform.ResultType(Type) Type
( UUIDType) Equals(other Type) bool
( UUIDType) String() string
( UUIDType) Type() string
UUIDType : PrimitiveType
UUIDType : Type
UUIDType : expvar.Var
UUIDType : fmt.Stringer
type Void = (struct)
VoidTransform is a transformation that always returns nil.
( VoidTransform) MarshalText() ([]byte, error)
( VoidTransform) ResultType(t Type) Type
( VoidTransform) String() string
VoidTransform : Transform
VoidTransform : encoding.TextMarshaler
VoidTransform : expvar.Var
VoidTransform : fmt.Stringer
YearTransform transforms a datetime value into a year value.
( YearTransform) MarshalText() ([]byte, error)
( YearTransform) ResultType(Type) Type
( YearTransform) String() string
YearTransform : Transform
YearTransform : encoding.TextMarshaler
YearTransform : expvar.Var
YearTransform : fmt.Stringer
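For orientation, the Iceberg table spec defines the year transform's result as the number of years from 1970. The sketch below computes that value from a timestamp in microseconds; `yearFromMicros` is a hypothetical helper, and nothing here is taken from this package's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// yearFromMicros returns the calendar year of the timestamp (in UTC)
// expressed as years from 1970, so dates before 1970 are negative.
func yearFromMicros(micros int64) int {
	return time.UnixMicro(micros).UTC().Year() - 1970
}

func main() {
	fmt.Println(yearFromMicros(0))                     // 0
	fmt.Println(yearFromMicros(-1))                    // -1
	fmt.Println(yearFromMicros(1_700_000_000_000_000)) // 53
}
```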
Package-Level Functions (total 23)
AvroSchemaFromEntriesV1 creates an Avro schema from the given manifest entries.
The entries must all share the same partition spec.
func DataFileFromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (DataFile, *Schema, error)
func DecimalTypeOf(prec, scale int) DecimalType
func FixedTypeOf(n int) FixedType
IndexByID performs a post-order traversal of the given schema and
returns a mapping from field ID to field.
IndexByName performs a post-order traversal of the schema and returns
a mapping from field name to field ID.
IndexNameByID performs a post-order traversal of the schema and returns
a mapping from field ID to field name.
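The index functions above can be pictured with a small stand-in tree. In this sketch `field`, `indexByName` and `invert` are hypothetical, and keying nested fields by dotted path (for example "location.lat") is an assumption of the sketch rather than documented behaviour of this package.

```go
package main

import "fmt"

// field is a hypothetical stand-in for a schema field;
// it is not the package's NestedField.
type field struct {
	id       int
	name     string
	children []field
}

// indexByName walks the fields post-order (children before their parent)
// and records each field's path and ID in out.
func indexByName(fields []field, prefix string, out map[string]int) {
	for _, f := range fields {
		path := f.name
		if prefix != "" {
			path = prefix + "." + f.name
		}
		indexByName(f.children, path, out)
		out[path] = f.id
	}
}

// invert derives an ID-to-name index from a name-to-ID index.
func invert(byName map[string]int) map[int]string {
	byID := make(map[int]string, len(byName))
	for name, id := range byName {
		byID[id] = name
	}
	return byID
}

func main() {
	fields := []field{
		{id: 1, name: "id"},
		{id: 2, name: "location", children: []field{
			{id: 3, name: "lat"},
			{id: 4, name: "long"},
		}},
	}
	byName := map[string]int{}
	indexByName(fields, "", byName)
	fmt.Println(byName["location.lat"], invert(byName)[4])
}
```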
func ManifestEntryV1FromParquet(path string, size int64, schema *Schema, r io.ReaderAt) (ManifestEntry, *Schema, error)
func NewDataFileV1Builder(FilePath string, FileFormat FileFormat, PartitionSpec map[string]any, RecordCount int64, FileSizeBytes int64) DataFileBuilder
func NewManifestEntryV1(entryStatus ManifestEntryStatus, snapshotID int64, data DataFile) ManifestEntry
NewManifestV1Builder is passed all of the required fields and then allows
all of the optional fields to be set by calling the corresponding methods
before calling [ManifestV1Builder.Build] to construct the object.
NewManifestV2Builder is constructed with the primary fields, with the remaining
fields set to their zero value unless modified by calling the corresponding
methods of the builder. Call [ManifestV2Builder.Build] to retrieve the
constructed ManifestFile.
func NewPartitionSpec(fields ...PartitionField) PartitionSpec
func NewPartitionSpecID(id int, fields ...PartitionField) PartitionSpec
NewSchema constructs a new schema with the provided ID
and list of fields.
NewSchemaWithIdentifiers constructs a new schema with the provided ID
and fields, along with a slice of field IDs to be listed as identifier
fields.
ParseTransform takes the string representation of a transform as
defined in the iceberg spec, and produces the appropriate Transform
object or an error if the string is not a valid transform string.
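To make the parsing step concrete, here is a minimal sketch of turning a transform string into a structured value. It assumes the string forms used by the Iceberg spec ("identity", "void", "year", "month", "day", "hour", "bucket[N]", "truncate[W]"); the `parsed` type and `parseTransform` function are hypothetical and do not reproduce this package's ParseTransform.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsed is a hypothetical result type: the transform name plus an
// optional integer parameter (bucket count or truncate width).
type parsed struct {
	name  string
	param int
}

// parseTransform recognises the plain transform names and the two
// parameterised forms, and rejects everything else.
func parseTransform(s string) (parsed, error) {
	switch s {
	case "identity", "void", "year", "month", "day", "hour":
		return parsed{name: s}, nil
	}
	for _, name := range []string{"bucket", "truncate"} {
		if strings.HasPrefix(s, name+"[") && strings.HasSuffix(s, "]") {
			n, err := strconv.Atoi(s[len(name)+1 : len(s)-1])
			if err == nil && n > 0 {
				return parsed{name: name, param: n}, nil
			}
		}
	}
	return parsed{}, fmt.Errorf("invalid transform: %q", s)
}

func main() {
	for _, s := range []string{"bucket[16]", "truncate[4]", "day", "bucket[x]"} {
		p, err := parseTransform(s)
		fmt.Println(p, err)
	}
}
```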
PruneColumns visits a schema pruning any columns which do not exist in the
provided selected set. Parent fields of a selected child will be retained.
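The rule that parents of a selected child are retained can be sketched on a stand-in tree. The `field` type and `prune` function below are hypothetical; the sketch uses a `map[int]struct{}` as the selected set (mirroring the `map[int]Void` parameter) and, as a simplification, keeps the whole subtree of any selected field instead of modelling `selectFullTypes`.

```go
package main

import "fmt"

// field is a hypothetical stand-in for a schema field;
// it is not the package's NestedField.
type field struct {
	id       int
	name     string
	children []field
}

// prune keeps every field whose ID is in selected, plus any ancestor
// that still has at least one kept descendant.
func prune(fields []field, selected map[int]struct{}) []field {
	var kept []field
	for _, f := range fields {
		if _, ok := selected[f.id]; ok {
			kept = append(kept, f) // selected: keep the whole subtree (a simplification)
			continue
		}
		if children := prune(f.children, selected); len(children) > 0 {
			// Not selected itself, but retained as the parent of a selected child.
			kept = append(kept, field{id: f.id, name: f.name, children: children})
		}
	}
	return kept
}

func main() {
	fields := []field{
		{id: 1, name: "id"},
		{id: 2, name: "location", children: []field{
			{id: 3, name: "lat"},
			{id: 4, name: "long"},
		}},
	}
	fmt.Printf("%+v\n", prune(fields, map[int]struct{}{3: {}}))
}
```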
ReadManifestList reads in an avro manifest list file and returns a slice
of manifest files or an error if one is encountered.
Type Parameters:
T: any
Visit accepts a visitor and performs a post-order traversal of the given schema.
func WriteManifestListV1(w io.Writer, files []ManifestFile) error
func WriteManifestV1(w io.Writer, schema *Schema, entries []ManifestEntry) error
Package-Level Variables (total 8)
var PrimitiveTypes struct{Bool PrimitiveType; Int32 PrimitiveType; Int64 PrimitiveType; Float32 PrimitiveType; Float64 PrimitiveType; Date PrimitiveType; Time PrimitiveType; Timestamp PrimitiveType; TimestampTz PrimitiveType; String PrimitiveType; Binary PrimitiveType; UUID PrimitiveType}
UnpartitionedSpec is the default unpartitioned spec which can
be used for comparisons or to just provide a convenience for referencing
the same unpartitioned spec object.
Package-Level Constants (total 16)
EntryV1SchemaTmpl is a Go text/template template for the Avro schema of a v1 manifest entry.
It expects a map[string]any of the partitions as the templated object. It calls a custom Type function to determine the Avro type for each partition value.
It also calls a PartitionFieldID function to determine the field-id for each partition value.
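The mechanism described here, a text/template that calls custom functions named Type and PartitionFieldID, can be sketched with the standard library alone. Everything below is hypothetical: `tmplText` is a much-reduced template and not the contents of EntryV1SchemaTmpl, `avroType` is an invented Go-to-Avro mapping, and the field IDs are supplied by the caller purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// tmplText is a hypothetical, reduced template; the real
// EntryV1SchemaTmpl is not reproduced here.
const tmplText = `{{range $name, $value := .}}{"name": "{{$name}}", "type": "{{Type $value}}", "field-id": {{PartitionFieldID $name}}}
{{end}}`

// avroType is an invented mapping from Go values to Avro type names.
func avroType(v any) string {
	switch v.(type) {
	case int, int32:
		return "int"
	case int64:
		return "long"
	case string:
		return "string"
	default:
		return "bytes"
	}
}

// render executes the template with the partition map as the templated
// object, registering the two custom functions through a FuncMap.
func render(partitions map[string]any, fieldIDs map[string]int) (string, error) {
	funcs := template.FuncMap{
		"Type":             avroType,
		"PartitionFieldID": func(name string) int { return fieldIDs[name] },
	}
	t, err := template.New("entry").Funcs(funcs).Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, partitions); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(
		map[string]any{"day": int32(19000), "region": "eu"},
		map[string]int{"day": 1000, "region": 1001},
	)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because text/template visits map keys in sorted order, the rendered fields come out in a stable order regardless of how the map was built.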
const AvroFile FileFormat = "AVRO"
const AvroManifestEntryV2Schema = "{\n \"type\": \"record\",\n \"name\": \"manifest_entry...
const AvroManifestListV1Schema = "{\n\t\t\"type\": \"record\",\n\t\t\"name\": \"manifest_file\",\n\t\t...
const AvroManifestListV2Schema = "{\n \"type\": \"record\",\n \"name\": \"manifest_file\...
const EntryContentData ManifestEntryContent = 0
const EntryContentEqDeletes ManifestEntryContent = 2
const EntryContentPosDeletes ManifestEntryContent = 1
const EntryStatusADDED ManifestEntryStatus = 1
const EntryStatusDELETED ManifestEntryStatus = 2
const EntryStatusEXISTING ManifestEntryStatus = 0
const InitialPartitionSpecID = 0
const ManifestContentData ManifestContent = 0
const ManifestContentDeletes ManifestContent = 1
const OrcFile FileFormat = "ORC"
const ParquetFile FileFormat = "PARQUET"