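Example for the jtug package (https://github.com/benjajaja/jtug): decode tagged JSON objects into Go types like
```go
type Tag string

const (
	A Tag = "A"
	B Tag = "B"
)

type StructA struct {
	Type  Tag `json:"type"`
	Count int `json:"count"`
}

type StructB struct {
	Type Tag    `json:"type"`
	Data string `json:"data"`
}
```
by writing a (subjectively) minimal amount of boilerplate:
```go
// Needs encoding/json, fmt, and github.com/benjajaja/jtug.
type Union = jtug.Union[Tag]
type List = jtug.UnionList[Tag, Mapper]
type Mapper struct{}

// Unmarshal picks the concrete struct for the given tag and decodes into it.
func (Mapper) Unmarshal(b []byte, t Tag) (jtug.Union[Tag], error) {
	switch t {
	case A:
		var value StructA
		// Decode first, then return: in `return value, json.Unmarshal(b, &value)`
		// the order in which `value` is read and the call runs is unspecified.
		err := json.Unmarshal(b, &value)
		return value, err
	case B:
		var value StructB
		err := json.Unmarshal(b, &value)
		return value, err
	default:
		return nil, fmt.Errorf("unknown tag: %q", t)
	}
}
```
This shows that now it's possible to use `json.Unmarshal` directly:
```go
var list List
err := json.Unmarshal([]byte(`[
	{"type":"A","count":10},
	{"type":"B","data":"hello"}
]`), &list)
if err != nil {
	panic(err)
}
for i := range list {
	switch t := list[i].(type) {
	case StructA:
		println(t.Count)
	case StructB:
		println(t.Data)
	// etc.
	}
}
```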
IME this is a longstanding pain point with Go. There's an attempt to propose an encoding/json/v2 package [1] being kicked around at the moment [2], spawned from a discussion [3].
This at least seems to slightly improve the situation of marshalling to/from an interface directly, by providing the ability to pass custom Unmarshalers for a specific type (via json.WithUnmarshalers and json.UnmarshalFunc) to the Unmarshal functions, but it appears to still have the inefficient double-decode problem. Or I just haven't found a decent way around it yet.
Looks like they're intentionally punting on a first-class solution until (if) the language gets some sort of sum type, but I still think the second-class solution could do a bit more to make this extremely common use case more convenient. Pretty much every serious production Go app I've worked on in the last 10 years or so has had some horrible coping strategy for the "map a field-discriminated object to/from implementations of an interface" gap, often involving some sort of double-unmarshal.
Quote from the proposal [1]:
> First-class support for union types: It is common for the type of a particular JSON value to be dynamically changed based on context. This is difficult to support in Go as the equivalent of a dynamic value is a Go interface. When unmarshaling, there is no way in Go reflection to enumerate the set of possible Go types that can be stored in a Go interface in order to choose the right type to automatically unmarshal a dynamic JSON value. Support for such use cases is deferred until better Go language support exists.
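For anyone who hasn't run into it, the double-unmarshal coping strategy mentioned above usually looks something like the following. This is only a minimal sketch using the current encoding/json package; the `Action` interface, the `CreateObject`/`DeleteObject` types, and the `"create"`/`"delete"` discriminator values are made up for illustration and are not taken from the article or the proposal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Action is a hypothetical interface with two hypothetical implementations.
type Action interface{ Kind() string }

type CreateObject struct {
	Name string `json:"name"`
}

func (CreateObject) Kind() string { return "create" }

type DeleteObject struct {
	ID string `json:"id"`
}

func (DeleteObject) Kind() string { return "delete" }

// decodeAction parses the input twice: once to read the "type"
// discriminator, and once more into the concrete type it selects.
func decodeAction(data []byte) (Action, error) {
	var head struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(data, &head); err != nil { // first pass
		return nil, err
	}
	switch head.Type {
	case "create":
		var a CreateObject
		if err := json.Unmarshal(data, &a); err != nil { // second pass
			return nil, err
		}
		return a, nil
	case "delete":
		var a DeleteObject
		if err := json.Unmarshal(data, &a); err != nil { // second pass
			return nil, err
		}
		return a, nil
	default:
		return nil, fmt.Errorf("unknown action type %q", head.Type)
	}
}

func main() {
	a, err := decodeAction([]byte(`{"type":"delete","id":"42"}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %+v\n", a, a)
}
```

The switch is the part every codebase ends up hand-writing, because nothing in reflection can enumerate the types that implement `Action`.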
> IME this is a longstanding pain point with Go.
Understatement of the year. And it’s really not limited to encoding: the lack of sum types in general is excruciating after having tasted them (in Rust, in my case). They click instantly as an abstraction and they solve countless real-world logic bugs. Not to mention their ergonomics in seemingly unrelated things, like eliminating null and handling errors with result types. Just sprinkle some pattern matching on top and you’re in paradise.
Last time I checked, the json/v2 package fixed the double decode problem by passing the decoder into the unmarshaling callback.
That's not what I found in my own experiments: I still had to unmarshal once inside the callback to get the `type` field out, then again once I knew what the type was. Do you have an example handy?
> until (if) the language gets some sort of sum type
Is there any discussion with the Go team about this actually happening?
I don't know about recently, but people were asking about them from the first announcement in 2009, and got the answer that they were "under consideration".
To be fair, it's a significant advantage of Go that they have been strict about keeping its feature set small.
Edited to add:
There is this 2023 proposal from Ian Lance Taylor (on the Go team): https://github.com/golang/go/issues/57644
But it makes all sum types include nil, which seems suboptimal.
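For context on the nil complaint: the interface-based workaround people use in today's Go already has the same property, since the zero value of any interface is nil. A minimal sketch of that workaround follows; `Shape`, `Circle`, and `Square` are hypothetical names, and nothing here is taken from the linked proposal.

```go
package main

import "fmt"

// Shape is a hypothetical "sealed" interface: the unexported marker method
// keeps types in other packages from implementing it.
type Shape interface{ isShape() }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (Circle) isShape() {}
func (Square) isShape() {}

// describe has to handle nil explicitly, and the compiler does not check
// that the switch covers every implementation of Shape.
func describe(s Shape) string {
	switch v := s.(type) {
	case Circle:
		return fmt.Sprintf("circle r=%g", v.R)
	case Square:
		return fmt.Sprintf("square side=%g", v.Side)
	case nil:
		return "nil"
	default:
		return "unknown"
	}
}

func main() {
	var s Shape // the zero value of an interface is nil
	fmt.Println(describe(s))
	fmt.Println(describe(Circle{R: 2}))
}
```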
Well, he advocated for and eventually got generics, so it could happen here too.
The weird JSON handling was the main reason I stopped using Go for side projects long ago.
Cool, but all you really needed to do was fix the contract between NewActionDeleteObject’s struct creation and the switch statement’s result = print. What’s really crazy is you can create anonymous structs to unmarshal JSON without having to fully flesh out data models, thanks to its “we’ll only map the fields we see in the struct” approach. Mapstructure takes this even further with squash. In the end, the type checker missed the error because it was user error by contract, and the lack of validation that the actions were constructed properly led to a rabbit-hole debugging session that ended with an excellent blog post about the gotchas of unmarshaling.
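The anonymous-struct trick can be sketched like this. The field names (`action`, `object`, `id`) and the `summarize` helper are invented for the example; the point is only that encoding/json silently skips any input field the struct does not declare.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summarize decodes only the two fields it cares about. Everything else in
// the input is ignored because the anonymous struct does not declare it.
func summarize(data []byte) (string, error) {
	var v struct {
		Action string `json:"action"`
		Object struct {
			ID string `json:"id"`
		} `json:"object"`
	}
	if err := json.Unmarshal(data, &v); err != nil {
		return "", err
	}
	return v.Action + " " + v.Object.ID, nil
}

func main() {
	in := []byte(`{
		"action": "delete",
		"object": {"id": "42", "name": "report", "tags": ["a", "b"]},
		"requested_by": "alice"
	}`)
	s, err := summarize(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```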
BTW double unmarshalling (and double marshalling) can be quite slow, so to speed up determining the object type you can extract the type field with e.g. gjson (https://github.com/tidwall/gjson). It can easily be 10x faster for this kind of scenario.
Nice article!
Decoding sum types into Go interface values is obviously tricky stuff, but it gets even harder when you have recursive data structures as in an abstract syntax tree (AST). The article doesn't address this. Since there wasn't anything out there to do this, we built a little package called "unpack" as part of the SuperDB project.
The package is here...
https://github.com/brimdata/super/blob/main/pkg/unpack/refle...
and an example use in SuperDB is here...
https://github.com/brimdata/super/blob/main/compiler/ast/unp...
Sorry it's not very well documented, but once we got it working, we found the approach quite powerful and easy.
Shameless plug: I wrote a "JSON Tagged Union" package for Go: https://github.com/benjajaja/jtug
It lets you decode a list of type-tagged JSON objects into plain Go structs by writing a (subjectively) minimal amount of boilerplate, after which it's possible to use `json.Unmarshal` directly.
Of course, it relies on reflection, and is generally not very efficient. If you control the API, and it's going to be consumed by Go, then I would just not do tagged unions.
Interesting to see V mentioned here. Is it still the chaotic mess of a language that'll never be like it was a few years ago?
Pulled my hair out doing this all over the place when integrating with a Node API, and ended up writing https://github.com/byrnedo/pjson. Feels like this should be covered as a more first-class thing in Go.
Surprising to see V lang brought up. What’s its current reputation?
Reading the article, I reached the same conclusion as every time I approach sum types: they are ONLY useful for handling malformed JSON structures or for hacking around BAD data structure/logic design, at least for most business applications (for system-level programs my reasoning is different).
The example JSON in the article, even if it may be common, is broken, and I would not accept such a design, because an action on an object must require the action AND the object.
For many years, I have advised companies developing business applications to avoid programming constructs (like sum types) that are very far from what a business person would understand (think of a paper business form for the first example in the article). And the results are good: the business logic in the program stays as close as possible to the business logic as business people describe it.