evalv3: OOM and signal killed (64GB) #3334
I wrote this quick script to see if I could find anything obvious, but it doesn't appear that any individual directory causes this; it seems to be the cumulative effect of all of them. There are some pretty large spikes in memory usage:

❯ for dir in */; do echo "$dir:" && CUE_EXPERIMENT=evalv3 go run cuelang.org/go/cmd/cue@v0.10.0-alpha.2.0.20240726151100-e68bd632c29e export "./$dir"; done

Some of those spikes are around 10 gigabytes.
If it helps, the latest stable release without evalv3 takes at most 2GB for everything all at once. For reference:

❯ cue version
cue version 0.9.2
go version go1.22.5
-buildmode pie
-compiler gc
-trimpath true
DefaultGODEBUG httplaxcontentlength=1,httpmuxgo121=1,tls10server=1,tlsrsakex=1,tlsunsafeekm=1
CGO_ENABLED 1
GOARCH amd64
GOOS linux
GOAMD64 v1
cue.lang.version v0.9.2
I don't know if this pprof info is helpful.
Some updated numbers, as of CUE 655236e:
We see a similar issue with a few other medium to large projects when using the new evaluator. The current work in #2854 and particularly #2853 should help significantly here. I will update this thread once more progress has been made, as we are actively merging improvements into master.
I have finally been able to reduce a bit of CUE which seems to OOM my machine under evalv3 where evalv2 definitely did not: #3509
This still seems to happen as of 0005c22.
I again reduced part of this project into another performance issue to be fixed: #3633
Here is another reduction, similar to #3633, yet not the same:
As of 0b36c4e, this takes 80ms to
Significant progress; I am now able to finish evaluating with evalv3, even if it's still quite a bit slower than evalv2:
With @mpvl, we believe the remaining work here is memory management, which evalv2 did to some extent but evalv3 does not do yet. This is also why evalv3 above takes more than 20GiB of memory to finish.
I ran this again, as of the latest version of the automata and cue repos, and got the following stats:
I think this issue overlaps with #3881 too; the pprof shows me:
So, even though we waste at least 25% on GC overhead from allocations, we waste at least 46% on the |
Benchmarking the reduction at https://cuelang.org/issue/3334 via benchcmd -n 12 CueVetIssue3334 cue vet -c f.cue:

                  │     old      │                  new                  │
                  │    sec/op    │    sec/op     vs base                 │
CueVetIssue3334     175.3m ± 1%    154.9m ± 1%   -11.67% (p=0.000 n=12)

                  │     old      │                  new                  │
                  │ user-sec/op  │ user-sec/op   vs base                 │
CueVetIssue3334     332.1m ± 4%    270.8m ± 5%   -18.46% (p=0.000 n=12)

                  │     old      │                  new                  │
                  │  sys-sec/op  │ sys-sec/op    vs base                 │
CueVetIssue3334     27.61m ± 16%   21.92m ± 30%  ~ (p=0.089 n=12)

                  │      old       │                   new                   │
                  │ peak-RSS-bytes │ peak-RSS-bytes  vs base                 │
CueVetIssue3334      182.2Mi ± 4%     177.4Mi ± 1%  -2.61% (p=0.000 n=12)

Updates #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: I1f066de341e8fd2e9d508304e3de1b86234f60a6
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1213055
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Reviewed-by: Marcel van Lohuizen <mpvl@gmail.com>
The getReplacements call was creating an entirely new slice each time, as can be seen by the inline perf TODO. We can however avoid most of that by reusing a local buffer, not needing to alter nodeContext or OpContext at all just yet.

As measured by benchcmd -n 6 VetAutomataList cue vet -c k8s/magiclove/list.cue on a simplified version of https://github.com/uhthomas/automata, we see a significant win:

                  │     old      │                new                 │
                  │    sec/op    │    sec/op    vs base               │
VetAutomataList      3.001 ± 1%     2.877 ± 2%  -4.12% (p=0.002 n=6)

                  │     old      │                 new                 │
                  │ user-sec/op  │ user-sec/op   vs base               │
VetAutomataList       3.979 ± 6%     3.769 ± 6%  -5.29% (p=0.026 n=6)

                  │     old      │                new                │
                  │  sys-sec/op  │ sys-sec/op   vs base              │
VetAutomataList     99.79m ± 11%   93.96m ± 8%  ~ (p=0.132 n=6)

                  │      old       │                 new                  │
                  │ peak-RSS-bytes │ peak-RSS-bytes  vs base              │
VetAutomataList      789.5Mi ± 4%     743.2Mi ± 6%  -5.87% (p=0.026 n=6)

Updates #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: Ia598d49e9ddd13c687315fc1726160c7f6af8800
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1213493
Reviewed-by: Marcel van Lohuizen <mpvl@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
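The pattern described in that change, appending into a caller-owned buffer instead of allocating a fresh slice on every call, is generic Go. Below is a minimal sketch of the technique; the names (conjunct, collectInto) are made up for illustration and are not CUE's internals.

```go
package main

import "fmt"

type conjunct struct{ id int }

// collectInto appends its results to buf and returns the extended slice,
// so the caller can keep one backing array alive across many calls
// instead of allocating a new slice on every invocation.
func collectInto(buf []conjunct, n int) []conjunct {
	buf = buf[:0] // reuse the existing capacity
	for i := 0; i < n; i++ {
		buf = append(buf, conjunct{id: i})
	}
	return buf
}

func main() {
	var buf []conjunct // grows once, then is reused
	for round := 0; round < 3; round++ {
		buf = collectInto(buf, 4)
		fmt.Println("round", round, "len", len(buf), "cap", cap(buf))
	}
}
```

After the first round the capacity stays stable, so the per-call allocation disappears from the hot loop.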
In looking at why https://github.com/uhthomas/automata was spending so much time in checkTypos, I realised that we were doing a lot of work in the initial appendRequired call, even though the main v.Arcs loop did nothing as len(v.Arcs) == 0. Detect that edge case early, which can be very common in some projects, and avoid doing any further work.

As measured by benchcmd -n 6 VetAutomataList cue vet -c k8s/magiclove/list.cue on a simplified version of https://github.com/uhthomas/automata, we see a very substantial win:

                  │     old      │                 new                  │
                  │    sec/op    │    sec/op     vs base                │
VetAutomataList     2877.5m ± 2%   722.1m ± 2%   -74.91% (p=0.002 n=6)

                  │     old      │                 new                 │
                  │ user-sec/op  │ user-sec/op   vs base               │
VetAutomataList       3.769 ± 6%     1.231 ± 9%  -67.34% (p=0.002 n=6)

                  │     old      │                 new                 │
                  │  sys-sec/op  │ sys-sec/op    vs base               │
VetAutomataList     93.96m ± 8%    79.04m ± 15%  -15.88% (p=0.015 n=6)

                  │      old       │                 new                  │
                  │ peak-RSS-bytes │ peak-RSS-bytes  vs base              │
VetAutomataList      743.2Mi ± 6%     672.5Mi ± 3%  -9.51% (p=0.002 n=6)

The full automata project also sees a similar speed-up, going from about 180s total time to evaluate to about 30s.

Updates #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: I24c9a7bdac496608799fc72f6abd4c49eaf32180
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1213494
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Reviewed-by: Marcel van Lohuizen <mpvl@gmail.com>
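The guard that change adds is a plain early return. The sketch below is hypothetical (appendRequiredSketch and its parameters are invented, not CUE's actual signature); it only illustrates the "no arcs, so skip all set-up work" fast path.

```go
package main

import "fmt"

// appendRequiredSketch is a hypothetical stand-in for the shape of the
// real appendRequired: when there are no arcs, the loop below would do
// nothing, so return the input unchanged before doing any set-up work.
func appendRequiredSketch(dst []string, arcs []string, required map[string]bool) []string {
	if len(arcs) == 0 {
		return dst // common fast path: nothing to check
	}
	for _, a := range arcs {
		if required[a] {
			dst = append(dst, a)
		}
	}
	return dst
}

func main() {
	req := map[string]bool{"spec": true}
	fmt.Println(appendRequiredSketch(nil, nil, req))                      // fast path, no work done
	fmt.Println(appendRequiredSketch(nil, []string{"spec", "meta"}, req)) // [spec]
}
```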
With https://review.gerrithub.io/c/cue-lang/cue/+/1213494, we are now looking much better, even if still a bit slower:
Memory usage still peaked above 20GiB, but at least the wall time and CPU usage are much lower now.
nodeContext's overlays and disjunct fields were added and set in https://cuelang.org/cl/1190800 but were never used. Peak memory usage on a simplified version of https://github.com/uhthomas/automata goes down a bit, as most of the allocations come from cloning vertices and contexts, so the size of these types is multiplied:

                  │      old       │                 new                  │
                  │ peak-RSS-bytes │ peak-RSS-bytes  vs base              │
VetAutomataList      669.5Mi ± 3%     629.0Mi ± 3%  -6.06% (p=0.000 n=8)

Updates #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: I7ebf8db1f72cee7e4246ad10c4c0d9864ea3110e
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1213518
Reviewed-by: Marcel van Lohuizen <mpvl@gmail.com>
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
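The reasoning is that a per-node struct's size is paid once per clone, so dropping unused fields saves memory in proportion to the number of clones. A toy illustration follows; nodeBefore and nodeAfter are invented types standing in for CUE's real nodeContext, not its actual layout.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Hypothetical before/after shapes: dropping two unused pointer-sized
// fields shrinks every clone of the struct, and the saving is multiplied
// by however many times the evaluator clones it.
type nodeBefore struct {
	state    int
	overlays *nodeBefore // never read
	disjunct *nodeBefore // never read
}

type nodeAfter struct {
	state int
}

func main() {
	const clones = 1_000_000
	perClone := unsafe.Sizeof(nodeBefore{}) - unsafe.Sizeof(nodeAfter{})
	fmt.Printf("saved %d bytes per clone, %d bytes across %d clones\n",
		perClone, perClone*clones, clones)
}
```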
walkVisitor was a wrapper around ast.Walk which kept a stack of visitors to be saved and reused as we walked up and down a syntax tree. Resolve also used a trick to only walk parts of some syntax nodes. Rather than returning a visitor for a node, it would recursively call walkVisitor only for the fields that it wanted to walk into.

However, these two compounded poorly; each recursive call to walkVisitor caused new allocations to make a new stackVisitor to carry state. Even though these allocations are small, they can add up into noticeable overhead, particularly for the garbage collector.

Instead, we can rely on ast.Walk rather than walkVisitor, while keeping the same trick to only walk parts of certain syntax nodes. This keeps the same semantics, but avoids the extra allocations.

                  │     old      │                new                 │
                  │     B/op     │     B/op     vs base               │
VetAutomataList     734.9Mi ± 0%   733.3Mi ± 0%  -0.22% (p=0.000 n=10)

                  │     old      │               new                 │
                  │  allocs/op   │ allocs/op   vs base               │
VetAutomataList     3.424M ± 0%   3.361M ± 0%  -1.85% (p=0.000 n=10)

For #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: Ibb84e22bf7cfdf8c6e2c6122ab21aed9934c13cc
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1214080
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Reviewed-by: Matthew Sackman <matthew@cue.works>
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
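CUE's internal walker isn't shown here, but the "walk only the fields you care about, then return nil so the generic walk stops" trick can be illustrated with Go's standard go/ast package, which has the same Visitor/Walk shape. This is an analogy under that assumption, not the actual CUE code.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// fieldVisitor descends only into the parts of a node it cares about:
// for a *ast.FuncDecl it walks just the body and returns nil, so
// ast.Walk does not also visit the name, receiver and signature.
type fieldVisitor struct{ idents int }

func (v *fieldVisitor) Visit(n ast.Node) ast.Visitor {
	if fn, ok := n.(*ast.FuncDecl); ok {
		if fn.Body != nil {
			ast.Walk(v, fn.Body) // walk only the field we care about
		}
		return nil // stop the generic walk over the remaining fields
	}
	if _, ok := n.(*ast.Ident); ok {
		v.idents++
	}
	return v
}

func main() {
	fset := token.NewFileSet()
	src := "package p\nfunc f() { x := 1; _ = x }"
	file, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		panic(err)
	}
	v := &fieldVisitor{}
	ast.Walk(v, file)
	// Counts p, x, _ and x; the skipped function name f is not counted.
	fmt.Println("identifiers visited:", v.idents)
}
```

The selective descent is done with plain recursion on ast.Walk itself, so no extra wrapper values are allocated per step.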
slices.Clone is an efficient way to make a copy of a slice, but it does create a new slice from scratch each time. In this case we are looping and each copy is only used as part of each iteration, so we can preallocate one slice big enough to make the copies and reuse that across all iterations.

                  │     old      │                new                 │
                  │     B/op     │     B/op     vs base               │
VetAutomataList     733.3Mi ± 0%   663.1Mi ± 0%  -9.57% (p=0.000 n=10)

                  │     old      │               new                 │
                  │  allocs/op   │ allocs/op   vs base               │
VetAutomataList     3.361M ± 0%   3.308M ± 0%  -1.58% (p=0.000 n=10)

For #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: Id12844a4c5f417d095521783bfc929985f23ade2
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1214081
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
Reviewed-by: Matthew Sackman <matthew@cue.works>
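As a generic illustration of the preallocation idea (not the actual CUE code), the sketch below copies each inner slice into one reusable buffer sized to the largest element, instead of calling slices.Clone once per iteration.

```go
package main

import "fmt"

func main() {
	src := [][]int{{1, 2, 3}, {4, 5, 6, 7}, {8}}

	// Size one buffer to the largest element so it can hold any copy.
	maxLen := 0
	for _, s := range src {
		if len(s) > maxLen {
			maxLen = len(s)
		}
	}

	// One allocation reused across iterations, instead of a fresh
	// slices.Clone allocation per iteration.
	buf := make([]int, maxLen)
	for _, s := range src {
		tmp := buf[:len(s)]
		copy(tmp, s)
		// tmp is only needed within this iteration, so reusing the
		// backing array between iterations is safe.
		fmt.Println(tmp)
	}
}
```

This trade-off only works because each copy's lifetime ends with its iteration; a copy that escapes the loop would still need its own allocation.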
apd.Decimal.Int64 on `d = hi - lo` will error if it overflows an int64. This is pretty common with CUE bounds like int64, which expands to:

    >=-9_223_372_036_854_775_808 & <=9_223_372_036_854_775_807

Constructing that error is unfortunate as it allocates a few times and stringifies the number too, which also has a cost. This is entirely unnecessary, as we don't use the error value at all. If we know the integer will have more than one digit, give up early.

                  │     old      │                new                 │
                  │     B/op     │     B/op     vs base               │
VetAutomataList     663.1Mi ± 0%   659.2Mi ± 0%  -0.59% (p=0.000 n=10)

                  │     old      │               new                 │
                  │  allocs/op   │ allocs/op   vs base               │
VetAutomataList     3.308M ± 0%   3.163M ± 0%  -4.37% (p=0.000 n=10)

Note that this requires us to move the d.Negative check earlier, as it should not be affected by the number of digits in d. Introduce the helper errIncompatibleBounds to avoid duplication.

Combining this and the two previous allocation optimizations, we can see a noticeable drop in the wall time:

                  │     old      │                new                 │
                  │    sec/op    │    sec/op    vs base               │
VetAutomataList     642.4m ± 3%    618.7m ± 1%  -3.69% (p=0.000 n=10)

                  │     old      │                 new                 │
                  │     B/op     │     B/op     vs base                │
VetAutomataList     734.9Mi ± 0%   659.3Mi ± 0%  -10.29% (p=0.000 n=10)

                  │     old      │               new                 │
                  │  allocs/op   │ allocs/op   vs base               │
VetAutomataList     3.424M ± 0%   3.163M ± 0%  -7.63% (p=0.000 n=10)

For #3334.

Signed-off-by: Daniel Martí <mvdan@mvdan.cc>
Change-Id: Ic0ba6031b0ace66bc1418ce76e745650021d22a0
Reviewed-on: https://review.gerrithub.io/c/cue-lang/cue/+/1214082
Reviewed-by: Matthew Sackman <matthew@cue.works>
Unity-Result: CUE porcuepine <cue.porcuepine@gmail.com>
TryBot-Result: CUEcueckoo <cueckoo@cuelang.org>
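The exact digit threshold in CUE's bounds logic isn't reproduced here; the sketch below shows the same idea with github.com/cockroachdb/apd/v3 in a generic form: skip Decimal.Int64 (and the overflow error it would allocate) when a plain-integer decimal clearly cannot fit in an int64. The helper name and the 19-digit cutoff are illustrative assumptions, not CUE's code.

```go
package main

import (
	"fmt"

	"github.com/cockroachdb/apd/v3"
)

// toInt64Cheap skips Decimal.Int64 when the value obviously cannot fit,
// since Int64 builds (and stringifies) an overflow error that we would
// immediately discard. The 19-digit cutoff works because any plain
// integer (exponent 0) with 20 or more digits exceeds math.MaxInt64.
func toInt64Cheap(d *apd.Decimal) (int64, bool) {
	if d.Exponent == 0 && d.NumDigits() > 19 {
		return 0, false // give up early, no error allocated
	}
	i, err := d.Int64()
	if err != nil {
		return 0, false
	}
	return i, true
}

func main() {
	var big apd.Decimal
	if _, _, err := big.SetString("18446744073709551616"); err != nil { // 2^64, 20 digits
		panic(err)
	}
	if _, ok := toInt64Cheap(&big); !ok {
		fmt.Println("does not fit in int64")
	}
}
```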
What version of CUE are you using (cue version)?
Does this issue reproduce with the latest stable release?
Yes.
What did you do?
After following #3330 (comment) to fix another issue, I thought it would be interesting to try the new evaluator again.
❯ CUE_EXPERIMENT=evalv3 go run cuelang.org/go/cmd/cue@v0.10.0-alpha.2.0.20240726151100-e68bd632c29e export ./k8s/amour/list.cue
signal: killed
uhthomas/automata@5a1a4fe
What did you expect to see?
Successful build.
What did you see instead?
OOM, signal killed.