-
-
Notifications
You must be signed in to change notification settings - Fork 34.3k
Description
Bug report
Bug description:
After gh-126835 moved constant folding from AST to CFG (specifically gh-130769 for tuple folding and gh-129550 for unary/binary op folding), compiling files with large constant tuples suffers from an O(N²) slowdown.
The root cause is add_const() in Python/flowgraph.c — it does a linear scan over the consts list to find a constant's index by identity:
for (index = 0; index < PyList_GET_SIZE(consts); index++) {
if (PyList_GET_ITEM(consts, index) == newconst) {
break;
}
}Before gh-126835, large constant tuples were folded in one shot at the AST stage, so add_const was rarely called with big consts lists. After gh-130769 moved tuple folding to CFG, fold_tuple_of_constants() calls add_const() once per inner tuple — N calls × O(N) scan = O(N²).
This also affects unary/binary op folding (fold_const_unaryop, fold_const_binop) moved to CFG in gh-129550:
(-1, -2, ..., -N)— eachUNARY_NEGATIVEfolded viaadd_const, 31.7x slower than 3.11 at N=100K(0+1, 0+2, ..., 0+N)— eachBINARY_OPfolded viaadd_const, 9.7x slower at N=100K
perf profiling shows add_const taking 83.76% of CPU time at N=100K.
Performance impact (N=100K):
| Scenario | 3.11 | 3.14 (regression) | Slowdown |
|---|---|---|---|
Nested tuples ((f, f), ...) |
1.24s | 10.38s | 8.4x |
Unary neg (-1, -2, ..., -N) |
0.22s | 6.86s | 31.7x |
Binary add (0+1, 0+2, ..., 0+N) |
0.27s | 2.59s | 9.7x |
Reproducer:
import sys, time
N = int(sys.argv[1]) if len(sys.argv) > 1 else 100000
elements = ", ".join(f"({i * 0.001:.6f}, {i * 0.002:.6f})" for i in range(N))
source = f"vertices = ({elements},)"
t0 = time.perf_counter()
code_obj = compile(source, "<test>", "exec")
t1 = time.perf_counter()
print(f"Compile time: {t1-t0:.4f}s", flush=True)CPython versions tested on:
3.14, 3.15
Operating systems tested on:
Linux