Prep towards 1.x release:
- More docs
- Error types renamed
- Privatize BinaryEncoder and BinaryDecoder
- Create a single "NewDatumWriter" to eventually replace the
  Specific/Generic versions.
James Crasta committed Sep 27, 2017
1 parent ed126be commit 3f67a23
Showing 16 changed files with 379 additions and 166 deletions.
1 change: 1 addition & 0 deletions .travis.yml
@@ -6,3 +6,4 @@ go:
 - 1.6
 - 1.7
 - 1.8
+- 1.9
36 changes: 36 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,36 @@
# Changelog

#### Version 0.2 (not yet released)

Intention: start making changes towards a 1.0 release.

API Changes:
- The `BinaryEncoder` type is now private. `avro.NewBinaryEncoder()` now
  returns a value implementing the `Encoder` interface.
- The `BinaryDecoder` type is likewise now private. `avro.NewBinaryDecoder()`
  now returns a value implementing the `Decoder` interface.
- Renamed the `Writer` and `Reader` interfaces to `Marshaler` and `Unmarshaler`,
  mirroring the standard library's JSON encoder, with similar method names.
- Renamed error types from `FooBar` to `ErrFooBar`.
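The `Unmarshaler` rename follows the `encoding/json` convention: a type that implements the interface bypasses reflection during deserialization, and the reader falls back to reflection otherwise. A minimal self-contained sketch of that fallback pattern (the `Decoder` stub, `Frob`, and `read` here are hypothetical illustrations, not the library's actual types):

```go
package main

import "fmt"

// Decoder is a one-method stub standing in for the library's Decoder interface.
type Decoder interface {
	ReadString() (string, error)
}

// Unmarshaler mirrors the renamed interface: types implementing it
// decode themselves without runtime reflection.
type Unmarshaler interface {
	UnmarshalAvro(dec Decoder) error
}

// stubDecoder always yields the same string, for demonstration.
type stubDecoder struct{}

func (stubDecoder) ReadString() (string, error) { return "hello", nil }

// Frob implements Unmarshaler, so read takes the fast path for it.
type Frob struct{ Name string }

func (f *Frob) UnmarshalAvro(dec Decoder) error {
	s, err := dec.ReadString()
	if err != nil {
		return err
	}
	f.Name = s
	return nil
}

// read mimics the type-assertion check at the top of a DatumReader's Read:
// use the Unmarshaler fast path when available.
func read(v interface{}, dec Decoder) error {
	if u, ok := v.(Unmarshaler); ok {
		return u.UnmarshalAvro(dec) // no reflection needed
	}
	return fmt.Errorf("reflection path omitted in this sketch")
}
```

Types that do not implement the interface still work; they simply pay the reflection cost.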

Improvements:
- Major improvement to docs and docs coverage


#### Version 0.1 (2017-08-23)

- First version after forking from elodina.
- Started versioning the API with semver in mind, using the gopkg.in import path,
  and planning for a 1.x release.

Improvements:
- Error reporting: specify which field is missing when returning FieldDoesNotExist
[#5](https://github.com/go-avro/avro/pull/5)
- Sped up encoding of strings and bools
[#6](https://github.com/go-avro/avro/pull/6)
- Schemas that are self-recursive or mutually recursive can now be prepared.

Bug Fixes:
- Can decode maps of non-primitive types [#2](https://github.com/go-avro/avro/pull/2)
- Fix encoding of 'fixed' type [#3](https://github.com/go-avro/avro/pull/3) [elodina/#78](https://github.com/elodina/go-avro/issues/78)
- Fix encoding of boolean when used in a type union [#4](https://github.com/go-avro/avro/pull/4)
36 changes: 25 additions & 11 deletions README.md
@@ -1,21 +1,23 @@
-Apache Avro for Golang
-=====================
-(forked from `elodina/go-avro`)
+# Apache Avro for Golang

-[![Build Status](https://travis-ci.org/go-avro/avro.svg?branch=master)](https://travis-ci.org/go-avro/avro)
+[![Build Status](https://travis-ci.org/go-avro/avro.svg?branch=master)](https://travis-ci.org/go-avro/avro) [![GoDoc](https://godoc.org/gopkg.in/avro.v0?status.svg)](https://godoc.org/gopkg.in/avro.v0)

-About This fork
----------------
-
-This fork has separated from elodina/go-avro in December 2016 because of the project not responding to PR's since around May 2016. Have tried to contact them to get maintainer access but the original maintainer no longer is able to make those changes, so I've forked currently. If elodina/go-avro returns, they are free to merge all the changes I've made back.
+Support for decoding/encoding avro using both map-style access (GenericRecord) and to/from arbitrary Go structs (SpecificRecord).
+
+This library started as a fork of `elodina/go-avro` but has now proceeded to become a maintained library.

-Documentation
--------------
+## Installation

-Installation is as easy as follows:
+Installation via go get:

-`go get gopkg.in/avro.v0`
+    go get gopkg.in/avro.v0


+## Documentation
+
+* [Read API/usage docs on Godoc](https://godoc.org/gopkg.in/avro.v0)
+* [Changelog](CHANGELOG.md)

Some usage examples are located in [examples folder](https://github.com/go-avro/avro/tree/master/examples):

@@ -24,3 +26,15 @@ Some usage examples are located in [examples folder](https://github.com/go-avro/
* [SpecificDatumReader/Writer](https://github.com/go-avro/avro/blob/master/examples/specific_datum/specific_datum.go)
* [Schema loading](https://github.com/go-avro/avro/blob/master/examples/load_schema/load_schema.go)
* Code gen support available in [codegen folder](https://github.com/go-avro/avro/tree/master/codegen)


## About this fork

This fork separated from elodina/go-avro in December 2016 because the
project had stopped responding to PRs since around May 2016. We tried to
contact them to get maintainer access, but the original maintainer is no
longer able to make those changes.

Originally we waited in hope that the elodina maintainer would return, but
that hasn't happened, so the plan now is to proceed with this as its own
library: take PRs, push for feature additions, and bump versions.
21 changes: 6 additions & 15 deletions codegen.go
@@ -1,20 +1,11 @@
-/* Licensed to the Apache Software Foundation (ASF) under one or more
-contributor license agreements. See the NOTICE file distributed with
-this work for additional information regarding copyright ownership.
-The ASF licenses this file to You under the Apache License, Version 2.0
-(the "License"); you may not use this file except in compliance with
-the License. You may obtain a copy of the License at
-http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License. */

package avro

// ***********************
// NOTICE this file was changed beginning in 2017 by the team maintaining
// https://github.com/go-avro/avro. This notice is required to be here due to the
// terms of the Apache license, see LICENSE for details.
// ***********************

import (
"bytes"
"errors"
Expand Down
24 changes: 8 additions & 16 deletions codegen/codegen.go
@@ -1,27 +1,19 @@
-/* Licensed to the Apache Software Foundation (ASF) under one or more
-contributor license agreements. See the NOTICE file distributed with
-this work for additional information regarding copyright ownership.
-The ASF licenses this file to You under the Apache License, Version 2.0
-(the "License"); you may not use this file except in compliance with
-the License. You may obtain a copy of the License at
-http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License. */

package main

// ***********************
// NOTICE this file was changed beginning in November 2016 by the team maintaining
// https://github.com/go-avro/avro. This notice is required to be here due to the
// terms of the Apache license, see LICENSE for details.
// ***********************

import (
"flag"
"fmt"
-	"github.com/elodina/go-avro"
"io/ioutil"
"os"
"strings"

+	"gopkg.in/avro.v0"
)

type schemas []string
Expand Down
41 changes: 24 additions & 17 deletions data_file.go
@@ -19,16 +19,17 @@ const objHeaderSchemaRaw = `{"type": "record", "name": "org.apache.avro.file.Hea
]
}`

-var objHeaderSchema = MustParseSchema(objHeaderSchemaRaw)
+var objHeaderSchema = Prepare(MustParseSchema(objHeaderSchemaRaw))

const (
-	version byte = 1
-	syncSize      = 16
-	schemaKey     = "avro.schema"
-	codecKey      = "avro.codec"
+	containerMagicVersion byte = 1
+	containerSyncSize          = 16
+
+	schemaKey = "avro.schema"
+	codecKey  = "avro.codec"
)

-var magic = []byte{'O', 'b', 'j', version}
+var magic = []byte{'O', 'b', 'j', containerMagicVersion}

// DataFileReader is a reader for Avro Object Container Files.
// More here: https://avro.apache.org/docs/current/spec.html#Object+Container+Files
@@ -48,7 +49,7 @@ type objFileHeader struct {
Sync []byte `avro:"sync"`
}

-func readObjFileHeader(dec *BinaryDecoder) (*objFileHeader, error) {
+func readObjFileHeader(dec *binaryDecoder) (*objFileHeader, error) {
reader := NewSpecificDatumReader()
reader.SetSchema(objHeaderSchema)
header := &objFileHeader{}
@@ -70,7 +71,7 @@ func NewDataFileReader(filename string, datumReader DatumReader) (*DataFileReade
// separated out mainly for testing currently, will be refactored later for io.Reader paradigm
func newDataFileReaderBytes(buf []byte, datumReader DatumReader) (reader *DataFileReader, err error) {
if len(buf) < len(magic) || !bytes.Equal(magic, buf[0:4]) {
-		return nil, NotAvroFile
+		return nil, ErrNotAvroFile
}

dec := NewBinaryDecoder(buf)
@@ -82,7 +83,7 @@ func newDataFileReaderBytes(buf []byte, datumReader DatumReader) (reader *DataFi
datum: datumReader,
}

-	if reader.header, err = readObjFileHeader(dec); err != nil {
+	if reader.header, err = readObjFileHeader(dec.(*binaryDecoder)); err != nil {
return nil, err
}

@@ -110,7 +111,7 @@ func (reader *DataFileReader) Seek(pos int64) {
func (reader *DataFileReader) hasNext() (bool, error) {
if reader.block.BlockRemaining == 0 {
if int64(reader.block.BlockSize) != reader.blockDecoder.Tell() {
-			return false, BlockNotFinished
+			return false, ErrBlockNotFinished
}
if reader.hasNextBlock() {
if err := reader.NextBlock(); err != nil {
@@ -177,13 +178,13 @@ func (reader *DataFileReader) NextBlock() error {
if err != nil {
return err
}
-	syncBuffer := make([]byte, syncSize)
+	syncBuffer := make([]byte, containerSyncSize)
err = reader.dec.ReadFixed(syncBuffer)
if err != nil {
return err
}
if !bytes.Equal(syncBuffer, reader.header.Sync) {
-		return InvalidSync
+		return ErrInvalidSync
}
reader.blockDecoder.SetBlock(reader.block)

@@ -195,21 +196,27 @@ func (reader *DataFileReader) NextBlock() error {
// DataFileWriter lets you write object container files.
type DataFileWriter struct {
output io.Writer
-	outputEnc *BinaryEncoder
+	outputEnc *binaryEncoder
datumWriter DatumWriter
sync []byte

// current block is buffered until flush
blockBuf *bytes.Buffer
blockCount int64
-	blockEnc *BinaryEncoder
+	blockEnc *binaryEncoder
}

// NewDataFileWriter creates a new DataFileWriter for given output and schema using the given DatumWriter to write the data to that Writer.
// May return an error if writing fails.
func NewDataFileWriter(output io.Writer, schema Schema, datumWriter DatumWriter) (writer *DataFileWriter, err error) {
-	encoder := NewBinaryEncoder(output)
-	datumWriter.SetSchema(schema)
+	encoder := newBinaryEncoder(output)
+	switch w := datumWriter.(type) {
+	case *SpecificDatumWriter:
+		w.SetSchema(schema)
+	case *GenericDatumWriter:
+		w.SetSchema(schema)
+	}

sync := []byte("1234567890abcdef") // TODO come up with other sync value

header := &objFileHeader{
Expand All @@ -232,7 +239,7 @@ func NewDataFileWriter(output io.Writer, schema Schema, datumWriter DatumWriter)
datumWriter: datumWriter,
sync: sync,
blockBuf: blockBuf,
-		blockEnc: NewBinaryEncoder(blockBuf),
+		blockEnc: newBinaryEncoder(blockBuf),
}

return
16 changes: 8 additions & 8 deletions datum_reader.go
@@ -13,10 +13,10 @@ import (
// terms of the Apache license, see LICENSE for details.
// ***********************

-// Reader is an interface that may be implemented to avoid using runtime reflection during deserialization.
+// Unmarshaler is an interface that may be implemented to avoid using runtime reflection during deserialization.
 // Implementing it is optional and may be used as an optimization. Falls back to using reflection if not implemented.
-type Reader interface {
-	Read(dec Decoder) error
+type Unmarshaler interface {
+	UnmarshalAvro(dec Decoder) error
}

// DatumReader is an interface that is responsible for reading structured data according to schema from a decoder
@@ -106,16 +106,16 @@ func (reader *SpecificDatumReader) SetSchema(schema Schema) {
// your struct field as follows: SomeValue int32 `avro:"some_field"`).
// May return an error indicating a read failure.
func (reader *SpecificDatumReader) Read(v interface{}, dec Decoder) error {
-	if reader, ok := v.(Reader); ok {
-		return reader.Read(dec)
+	if reader, ok := v.(Unmarshaler); ok {
+		return reader.UnmarshalAvro(dec)
}

rv := reflect.ValueOf(v)
if rv.Kind() != reflect.Ptr || rv.IsNil() {
return errors.New("Not applicable for non-pointer types or nil")
}
if reader.schema == nil {
-		return SchemaNotSet
+		return ErrSchemaNotSet
}
return reader.fillRecord(reader.schema, rv, dec)
}
@@ -401,7 +401,7 @@ func (reader *GenericDatumReader) Read(v interface{}, dec Decoder) error {
}
rv = rv.Elem()
if reader.schema == nil {
-		return SchemaNotSet
+		return ErrSchemaNotSet
}

//read the value
@@ -579,7 +579,7 @@ func (reader *GenericDatumReader) mapUnion(field Schema, dec Decoder) (interface
return reader.readValue(union, dec)
}

-	return nil, UnionTypeOverflow
+	return nil, ErrUnionTypeOverflow
}

func (reader *GenericDatumReader) mapFixed(field Schema, dec Decoder) ([]byte, error) {