philrz
left a comment
On this branch at commit ac39eb7 I tried to run this on attached sample data bench2.csv and it panicked.
$ super -version
Version: v0.2.0-21-gac39eb7a2
$ super -c "infer" bench2.csv
panic: runtime error: invalid memory address or nil pointer dereference
goroutine 1 [running]:
runtime/debug.Stack()
/usr/local/opt/go/libexec/src/runtime/debug/stack.go:26 +0x5e
github.com/brimdata/super/runtime/sam/op.(*Catcher).Pull.func1()
/Users/phil/work/super/runtime/sam/op/catcher.go:25 +0x3d
panic({0xecbb400?, 0xf114d00?})
/usr/local/opt/go/libexec/src/runtime/panic.go:860 +0x13a
github.com/brimdata/super/runtime/sam/op/infer.(*inferNode).load(0x3532d1f74f90, {0xef991b0?, 0x3532d21fc4b0}, {0x3532d2080000?, 0xbe8e80f?, 0x3532d22876d8?})
/Users/phil/work/super/runtime/sam/op/infer/infer.go:71 +0x23a
github.com/brimdata/super/runtime/sam/op/infer.(*converter).infer(0x3532d21f3ec0, {0x3532d1df4c08, 0x64, 0x97})
/Users/phil/work/super/runtime/sam/op/infer/op.go:176 +0xef
github.com/brimdata/super/runtime/sam/op/infer.(*converter).drain(0x3532d21f3ec0, 0x3532d1f744b0, 0x0)
/Users/phil/work/super/runtime/sam/op/infer/op.go:115 +0x12a
github.com/brimdata/super/runtime/sam/op/infer.(*converter).process(0x3532d21f3ec0, {0xefa6530, 0x3532d203c0c0})
/Users/phil/work/super/runtime/sam/op/infer/op.go:99 +0x1a6
github.com/brimdata/super/runtime/sam/op/infer.(*Op).Pull(0x3532d21f3ce0, 0x59?)
/Users/phil/work/super/runtime/sam/op/infer/op.go:57 +0x1d0
github.com/brimdata/super/runtime/sam/op.(*Single).Pull(0x3532d21f3d40, 0x40?)
/Users/phil/work/super/runtime/sam/op/mux.go:159 +0x33
github.com/brimdata/super/runtime/sam/op.(*Catcher).Pull(0x3532d211b9e8?, 0x25?)
/Users/phil/work/super/runtime/sam/op/catcher.go:28 +0x5c
github.com/brimdata/super/runtime/exec.(*Query).Pull(0xbe22a7f?, 0x40?)
/Users/phil/work/super/runtime/exec/query.go:49 +0x3c
github.com/brimdata/super/sbuf.CopyMux(0x3532d2287d08, {0xef901e0, 0x3532d21f3d70})
/Users/phil/work/super/sbuf/mux.go:39 +0x38
github.com/brimdata/super/cmd/super/root.(*Command).Run(0x3532d1f1a488, {0x3532d1c140f0, 0x1, 0x1})
/Users/phil/work/super/cmd/super/root/command.go:109 +0x9f9
github.com/brimdata/super/pkg/charm.path.run({0x3532d1bc6a78, 0x1, 0x1}, {0x3532d1c140f0, 0x1, 0x0?})
/Users/phil/work/super/pkg/charm/path.go:11 +0x7b
github.com/brimdata/super/pkg/charm.(*Spec).Exec(0xf1986c0, {0x3532d1c140d0, 0x3, 0x3})
/Users/phil/work/super/pkg/charm/charm.go:74 +0x1fa
main.main()
/Users/phil/work/super/cmd/super/main.go:39 +0x5b
| * [time](../types/time.md), or | ||
| * [bool](../types/bool.md). | ||
|
|
||
| `int64` inference takes precedence over `float64. All of the other candidate types |
| `int64` inference takes precedence over `float64. All of the other candidate types | |
| `int64` inference takes precedence over `float64`. All of the other candidate types |
| then computes an inferred type for the sample, where the inferred type is identical | ||
| to the input type except for any embedded string types inferred to be of a candidate type. | ||
| Such inference occurs when all of the values contained by that string type | ||
| are uniformly coercable to the candidate type, which may be one of: |
| are uniformly coercable to the candidate type, which may be one of: | |
| are uniformly coercible to the candidate type, which may be one of: |
| are unambiguous with one another. | ||
|
|
||
| If end of input is reached before collecting the desired sample size, then | ||
| the inferences is conducted on the available values. |
| the inferences is conducted on the available values. | |
| the inference is conducted on the available values. |
| the inferences is conducted on the available values. | ||
|
|
||
| Once a type is inferred for a given sample, the values are cast to that type | ||
| and output by the operator. If the inferred type is unchanged, then then the values |
| and output by the operator. If the inferred type is unchanged, then then the values | |
| and output by the operator. If the inferred type is unchanged, then the values |
| and output by the operator. If the inferred type is unchanged, then then the values | ||
| are output unmodified. | ||
|
|
||
| The operator may be reorder values as they are collected into a sample and analyzed. |
| The operator may be reorder values as they are collected into a sample and analyzed. | |
| The operator may reorder values as they are collected into a sample and analyzed. |
|
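The inference rule described in the documentation hunks above (a string type is inferred as a candidate type only when every sampled value is uniformly coercible to it, with `int64` taking precedence over `float64`) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `inferNumeric` is a hypothetical name, only the `int64` and `float64` candidates are modeled, and treating "coercible" as "parses with `strconv`" is an assumption.

```go
package main

import "strconv"

// inferNumeric is a hypothetical helper (not part of super). It returns the
// name of the type inferred for a sample of strings: "int64" if every value
// parses as a base-10 integer, otherwise "float64" if every value parses as
// a float, otherwise "string" (the input type is left unchanged).
func inferNumeric(sample []string) string {
	if len(sample) == 0 {
		// Nothing to infer from, so keep the original type.
		return "string"
	}
	allInt, allFloat := true, true
	for _, s := range sample {
		if _, err := strconv.ParseInt(s, 10, 64); err != nil {
			allInt = false
		}
		if _, err := strconv.ParseFloat(s, 64); err != nil {
			allFloat = false
		}
	}
	switch {
	case allInt:
		// int64 takes precedence over float64 when both would fit.
		return "int64"
	case allFloat:
		return "float64"
	default:
		return "string"
	}
}
```

Under these assumptions a sample of `"1"` and `"2"` is inferred as `int64`, while a single non-numeric value anywhere in the sample keeps the whole sample as `string`. A value written in scientific notation such as `"9.2310104e+07"` fails the integer parse and falls through to `float64`.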
I noticed something else interesting when testing this branch again, now at commit 7a05d0f. What I've found so far is that this works as expected: But when one of the numbers is very large, I suspect this may be down to the way the larger value gets formatted as scientific notation when rendered. Though a straight |
|
I don't think infer should infer "9.2310104e+07" as an integer. We can solve the problem a different way by inferring float64s (which typically come from JSON numbers) as int64 when the values in the inference sample all round to integers. My intention was that we do this in a subsequent PR. |
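For illustration, the follow-up idea of treating a `float64` sample as `int64` when every value is a whole number might look something like the sketch below. It is not taken from this PR or any later one: `allIntegral` is a hypothetical name, and the range check and the handling of NaN, infinities, and empty samples are assumptions about what a safe conversion would need.

```go
package main

import "math"

// allIntegral is a hypothetical helper (not part of super). It reports
// whether every float64 in the sample is a whole number that fits in an
// int64, so the sample could be converted without losing information.
func allIntegral(sample []float64) bool {
	if len(sample) == 0 {
		// Assumption: an empty sample gives no evidence for int64.
		return false
	}
	for _, f := range sample {
		if math.IsNaN(f) || math.IsInf(f, 0) {
			return false
		}
		if f != math.Trunc(f) {
			// Has a fractional part.
			return false
		}
		// float64(math.MaxInt64) rounds up to 2^63, which overflows int64,
		// so the upper bound must be exclusive.
		if f < -(1<<63) || f >= 1<<63 {
			return false
		}
	}
	return true
}
```

With this check, a sample containing 9.2310104e+07 (that is, 92310104) alongside other whole numbers would qualify for `int64`, while a sample containing 0.5 would stay `float64`.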
| @@ -0,0 +1,112 @@ | |||
| spq: values [1,"2"] | infer | |||
|
|
|||
Add vector: true to these tests