v1.25.0
Upgrade Notes
v1.25.0 raises Fission’s minimum Kubernetes version to 1.32, tightens two more admission rules around the HTTPTrigger path and the PodSpec capability allowlist, and moves several validation checks from the admission webhook to API-server CEL. Specs that rely on the rejected primitives — or clusters older than 1.32 — will fail admission after upgrade. Audit the items below before rolling out.
For the general upgrade steps (CRDs, CLI, Helm chart), see the Upgrade Guide. The breaking changes specific to v1.25.0 and the action each requires are below.
Minimum Kubernetes is now 1.32
The Helm chart’s kubeVersion is >=1.32.0-0, the envtest harness pulls 1.32.x assets, and the runtime health-check floor (MinimumKubernetesVersion) is 1.32.
Clusters older than 1.32 are no longer supported.
As part of this, the fluentbit PodSecurityPolicy manifest and the logger.podSecurityPolicy Helm values are removed — PodSecurityPolicy was removed from Kubernetes in 1.25 and cannot exist on a 1.32+ cluster.
If you relied on logger.podSecurityPolicy, use Pod Security Admission / Pod Security Standards instead.
HTTPTrigger path safety enforced at admission
HTTPTrigger.spec.relativeurl / spec.prefix are now validated by both the API server (CEL) and the Go-side Validate() path that the CLI and router status conditions use.
This closes GHSA-vchh-r53j-8mpw: an HTTPTrigger written via kubectl apply or the Kubernetes REST API could previously specify an empty path, contain .. traversal, be root-only (/), collide with router-owned routes (/router-healthz, /readyz, /_version, /auth/login), or shadow the router-internal /fission-function/<ns>/<name> prefix — the fission CLI already rejected these but kubectl bypassed them.
A trigger that violates any of these now fails admission.
The fission CLI is unchanged.
If you had any such triggers in your specs, fix the paths before upgrading.
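To make the rejected shapes concrete, here is a minimal Go sketch of the path rules listed above. It is not Fission's Validate() implementation or its CEL rules: checkTriggerPath is a hypothetical helper, and two details are assumptions made for illustration, namely that router-owned routes are matched exactly and that the internal prefix is matched with a trailing slash.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// reservedRoutes lists the router-owned routes named in the notes above.
var reservedRoutes = []string{"/router-healthz", "/readyz", "/_version", "/auth/login"}

// internalPrefix is the router-internal function prefix named in the notes above.
const internalPrefix = "/fission-function/"

// checkTriggerPath is a hypothetical helper that applies the listed rules in order.
// Exact-match collision and trailing-slash prefix matching are assumptions.
func checkTriggerPath(p string) error {
	switch {
	case p == "":
		return errors.New("path is empty")
	case p == "/":
		return errors.New("path is root-only")
	}
	// Reject any ".." path segment.
	for _, seg := range strings.Split(p, "/") {
		if seg == ".." {
			return errors.New("path contains .. traversal")
		}
	}
	for _, r := range reservedRoutes {
		if p == r {
			return fmt.Errorf("path collides with router-owned route %s", r)
		}
	}
	if strings.HasPrefix(p, internalPrefix) {
		return errors.New("path shadows the router-internal prefix")
	}
	return nil
}

func main() {
	paths := []string{"/hello", "", "/", "/a/../b", "/readyz", "/fission-function/default/fn"}
	for _, p := range paths {
		fmt.Printf("%q -> %v\n", p, checkTriggerPath(p))
	}
}
```

Running a check like this over the relativeurl and prefix values in your spec files before upgrading is one way to find triggers that would now fail admission.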
PodSpec capabilities are an allowlist (not a denylist)
Environment and Function PodSpec admission and the merge-layer sanitizer switch from a six-entry denylist to an allowlist that matches Kubernetes Pod Security Admission’s restricted profile: only NET_BIND_SERVICE may appear in capabilities.add.
The merge layer also forces capabilities.drop = ["ALL"] on every container — including containers whose source had no SecurityContext at all — so the OCI runtime’s default cap set (CHOWN/MKNOD/NET_RAW/SETUID/…) no longer reaches tenant containers.
This closes GHSA-qf5v-m7p4-95rp: the previous denylist was structurally incomplete — SYS_TIME (corrupts the shared, non-namespaced node wall clock), SYS_RAWIO, BPF, SYS_RESOURCE, and MAC_ADMIN all passed through, and the OCI default DAC_OVERRIDE reached containers regardless.
Any Environment/Function/Container spec that adds capabilities beyond NET_BIND_SERVICE is now rejected at admission.
Tenants that silently relied on the OCI default cap set will also see those caps stripped at merge time.
If a workload legitimately needed one of the dropped capabilities, run it outside Fission’s tenant-facing CRDs.
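The difference between the old denylist and the new allowlist can be sketched in a few lines of Go. The Capabilities type below is a simplified stand-in for the Kubernetes securityContext.capabilities field, and validateAdd and forceDropAll are hypothetical names, not Fission's functions. How the real merge layer treats a disallowed add entry is not stated above, so this sketch leaves the add list untouched when forcing the drop.

```go
package main

import "fmt"

// Capabilities is a simplified stand-in for securityContext.capabilities.
type Capabilities struct {
	Add  []string
	Drop []string
}

// allowedAdd mirrors the allowlist described above: only NET_BIND_SERVICE.
var allowedAdd = map[string]bool{"NET_BIND_SERVICE": true}

// validateAdd (hypothetical) returns an error for the first capability
// outside the allowlist. A nil Capabilities adds nothing and passes.
func validateAdd(c *Capabilities) error {
	if c == nil {
		return nil
	}
	for _, name := range c.Add {
		if !allowedAdd[name] {
			return fmt.Errorf("capability %q is not allowed in capabilities.add", name)
		}
	}
	return nil
}

// forceDropAll (hypothetical) returns a copy with Drop set to ["ALL"],
// creating the struct when the source container had none.
func forceDropAll(c *Capabilities) *Capabilities {
	out := &Capabilities{Drop: []string{"ALL"}}
	if c != nil {
		out.Add = append([]string(nil), c.Add...)
	}
	return out
}

func main() {
	fmt.Println(validateAdd(&Capabilities{Add: []string{"NET_BIND_SERVICE"}}))
	fmt.Println(validateAdd(&Capabilities{Add: []string{"SYS_TIME"}}))
	fmt.Println(forceDropAll(nil).Drop)
}
```

An allowlist fails closed: a capability added to the kernel later is rejected by default, whereas a denylist has to be extended each time to stay complete.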
HTTPTrigger / TimeTrigger / CanaryConfig admission webhooks are gone
CRD field rules for these three resources now run as CEL x-kubernetes-validations in the API server itself.
The dedicated admission webhooks are removed; the checks CEL cannot express (invalid cron schedules, CORS origin/max-age, ingress path regex) move into the router and timer reconcilers and surface as resource status conditions (RouteAdmitted=False, Scheduled=False) instead of admission rejections.
Behavioral change: a raw kubectl apply / GitOps write of an invalid cron, CORS origin, or ingress path is now admitted and flagged with a …=False condition rather than rejected at creation.
The fission CLI still rejects these client-side, so the common path is unchanged.
Operators relying on admission rejection of a malformed cron/CORS/ingress should check the new resource conditions instead.
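For tooling that used to treat an admission rejection as the failure signal, the equivalent check is now a scan of the resource's status conditions. The Go sketch below assumes the conditions follow the usual Kubernetes shape (type, status, message); the Condition struct is a simplified stand-in, failedConditions is a hypothetical helper, and the sample message is illustrative rather than real controller output.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Message string
}

// failedConditions (hypothetical) returns the conditions whose type is in
// watched and whose status is "False".
func failedConditions(conds []Condition, watched ...string) []Condition {
	want := make(map[string]bool, len(watched))
	for _, w := range watched {
		want[w] = true
	}
	var out []Condition
	for _, c := range conds {
		if want[c.Type] && c.Status == "False" {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// Sample data only; the message text is made up for this example.
	conds := []Condition{
		{Type: "RouteAdmitted", Status: "True"},
		{Type: "Scheduled", Status: "False", Message: "illustrative message"},
	}
	for _, c := range failedConditions(conds, "RouteAdmitted", "Scheduled") {
		fmt.Printf("%s=False: %s\n", c.Type, c.Message)
	}
}
```

A GitOps health check can apply the same logic after each sync: the write succeeds, and the condition carries the verdict.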
The Function, Environment, Package, MessageQueueTrigger, and KubernetesWatchTrigger webhooks are retained — they enforce the checks CEL cannot express (cross-namespace references, PodSpec/container security, package archive size, message-queue type/topic validity).
Deprecations/Removals
- Kubernetes < 1.32 is no longer supported (kubeVersion: ">=1.32.0-0" in the Helm chart).
- logger.podSecurityPolicy Helm values and the fluentbit PodSecurityPolicy manifest are removed (PSP was removed in Kubernetes 1.25).
- Admission webhooks for HTTPTrigger, TimeTrigger, and CanaryConfig are dropped, replaced by API-server CEL plus reconciler-written status conditions.
- The fission-bundle binary no longer pulls in github.com/aws/aws-sdk-go v1 or github.com/graymeta/stow: pkg/storagesvc now uses minio-go/v7 directly. ~3 MB of always-resident heap per subsystem is reclaimed; behaviour and Package URL ?id=… formats are preserved.
Highlights
- Reconciler consolidation in the executor (RFC-0004, Implemented). The executor’s nine controller-runtime reconcilers collapse to three. A single Function-centric reconciler uses .For(Function) plus .Watches(...) to react to its real dependency graph (Environment, ConfigMap, Secret, and the Deployment/Service/HPA it manages) instead of three per-executor-type Function reconcilers, two Environment reconcilers, and two ConfigMap/Secret reconcilers each running their own predicates and goroutine pools. The redundant standalone Deployment/Service informer factory is retired (one informer infrastructure removed from every executor process), and reads route through the manager cache. No CRD, CLI, or Helm-value change.
- Self-healing function workloads (RFC-0004). Because owner-reference garbage collection cannot cross namespaces, and Fission’s Deployment/Service/HPA often live in a FunctionNamespace distinct from the Function CR, the reconciler now .Watches() its managed objects via the existing function-identifying labels. A Deployment deleted out-of-band re-enqueues the owning Function and is recreated proactively (targeting MinScale), so MinScale>0 workloads stay warm instead of waiting for the next invocation. The request-path heal (GetFuncSvc → IsValid → re-specialize) remains the backstop; the periodic reaper becomes a long-tail backstop rather than the primary repair path.
- Reliable cross-namespace function teardown (RFC-0004). A new fission.io/function-cleanup finalizer on the shared Function reconciler tears workloads down via the owning executor type’s DeleteFunction before the Function CR is collected, closing the long-standing leak where the executor could miss a delete event and orphan cross-namespace workloads. Gated by the chart-wide finalizerEnabled toggle (default on); flipping it off drains any existing finalizer safely, and the deletion teardown path is flag-independent so toggling never strands an object.
- CRD field rules now run in the API server (RFC-0003). Validation that previously lived only in the admission webhook moves into x-kubernetes-validations (CEL) markers on the CRD types: executor type/strategy enums, autoscale bounds, archive/checksum/build-status enums, environment version range, pool-size and termination-grace bounds, HTTPTrigger HTTP-method enums, FunctionReference name (DNS-1123) validation, and Environment.spec.version immutability. The same change adds Server-Side Apply list markers, so kubectl apply --server-side, Argo CD, and Flux merge Fission resources without clobbering peer entries. Rules apply even when the admission webhook isn’t running, and kubectl explain surfaces them. The webhook is retained for the checks CEL cannot express (cross-namespace references, podspec/container security, the package archive literal-size limit); see the related webhook removal under Deprecations.
- Control-plane robustness for HA installs. Router, executor, and the singleton controllers (kubewatcher, timer, mqtrigger Kafka, mqt_keda, buildermgr, canaryconfigmgr) gain graceful-drain on shutdown via a fresh GRACEFUL_SHUTDOWN_TIMEOUT context, opt-in active-passive leader election via Kubernetes Lease (only the leader runs mutating controllers/reapers; standbys keep warm caches), /readyz gated on informer-cache sync (and leadership where applicable) so non-leaders stay out of Service endpoints, jittered retry backoff plus panic-recovery middleware on both router listeners, and removal of os.Exit(1) from router/executor/KEDA-scaler hot paths (degrade-and-retry instead of crashloop). Default single-replica installs behave identically: when leaderElection.enabled: false, the helper reports leader immediately.
- Helm chart additions for HA. Templated replicas and opt-in leaderElection.enabled for every leader-electable controller; probe split (/readyz readiness, /healthz liveness); opt-in rollout strategy and terminationGracePeriodSeconds; opt-in podDisruptionBudget for router and executor; opt-in autoscaling (HorizontalPodAutoscaler) for the router; coordination.k8s.io/leases RBAC wired into every leader-electable role. All defaults preserve the prior 9-pod, single-replica topology.
- Memory and metrics wins. Latency summaries (http_requests_duration_seconds and friends) are converted to Prometheus histograms; executor pod-cache maps are freed and the idle reaper’s concurrency is bounded; the zap base logger is memoized and webhook loggers are lazy-init. Composed with the aws-sdk-go v1 / graymeta/stow drop in pkg/storagesvc, ~6 MB of always-resident heap is reclaimed in every fission-bundle subsystem (pprof-confirmed against the kind-CI run).
Fixes
The five recurring runtime-error classes surfaced by CI log analysis are addressed under RFC-0006 — Runtime Error-Noise Reduction & Pod-Lifecycle Correctness (Implemented in #3468–#3473):
- exec /bin/sleep preStop hooks are gone. The lifecycle.preStop hook on executor- and fetcher-managed pods used to invoke /bin/sleep via exec, which (a) fails on distroless images like chainguard/static that have no shell or sleep, (b) always exits 137 because the sleep runs the full grace window, and (c) is a wasted CRI round-trip at grace=0. The hook now uses Kubernetes’ native lifecycle.preStop.sleep action (GA since 1.30; our floor is 1.32): the kubelet performs the sleep itself with no binary required, and grace=0 emits no hook at all.
- Trigger events on transient router 404s are no longer dropped. pkg/publisher (kubewatcher / timer / mqtrigger) treated every 4xx from the router’s internal listener as terminal, so events delivered during the window between trigger creation and mux reconciliation were silently dropped. 404 now falls through to the existing bounded retry (10× / ~17 min worst case); other 4xx and 5xx semantics are unchanged. CI now provisions metrics-server in the kind cluster so HPA scaling is actually exercised end-to-end for the first time.
- HPA actually scales newdeploy / container functions. Pod-wide Resource CPU/memory metrics require every container in the pod to declare a request, which Fission function pods rarely satisfy (function container has no CPU request unless the user or environment sets one, sidecars can lack them too). KCM logged "missing request for cpu" and HPA scaling silently never worked. The executor now rewrites those to ContainerResource metrics scoped to the function’s main container (GA since Kubernetes 1.30) and reconciles drift on existing HPAs, both on create and on fn update.
- Deletion races no longer log as errors. Pool destroy on an already-deleted Deployment, "fsvc not found in cache" on function delete, setInitialBuildStatus on a deleted Package, and router tap of an expired fsvc are now NotFound-tolerant and stay out of the error log.
- Specialize-vs-delete race and finalizer write races closed in the executor, and Function status writes retry on Conflict to absorb concurrent reconciles.
- Zip-slip in archive extraction is closed (CWE-22). pkg/utils zip extraction and the fetcher/builder shared-volume FS operations are confined with os.Root helpers.
- AdoptExistingResources hardened. The executor now has update on services in its RBAC, and the adopt/cleanup path is resilient to concurrent re-stamping with serial-adopt test coverage.
- Routine security sweeps refreshed Go dependencies (Kubernetes 0.36, controller-runtime 0.24, KEDA 2.20, OTel, sarama, minio-go, x-image) and bumped integration-test kind images to v1.32.11 / v1.34.3 / v1.35.1.
- Helm chart published as 1.25.0, versioned independently from the app version.
Changelog
What’s Changed
- ci(security): tighten GITHUB_TOKEN permissions and pin dashboard-linter by @sanketsudake in https://github.com/fission/fission/pull/3407
- chore(claude): add issue-pr-scrub backlog triage skill by @sanketsudake in https://github.com/fission/fission/pull/3408
- fix(executor): stop updateCPUUtilizationSvc goroutine on pool destroy by @sanketsudake in https://github.com/fission/fission/pull/3410
- fix(executor): apply executor-type label filter on informers (#2775) by @sanketsudake in https://github.com/fission/fission/pull/3411
- perf(metrics): use a histogram for http_requests_duration_seconds by @sanketsudake in https://github.com/fission/fission/pull/3412
- perf(executor): free pod cache maps and bound idle-reaper concurrency by @sanketsudake in https://github.com/fission/fission/pull/3413
- perf(metrics): convert remaining latency summaries to histograms by @sanketsudake in https://github.com/fission/fission/pull/3414
- feat(observability): memory-leak detection harness for router/executor by @sanketsudake in https://github.com/fission/fission/pull/3409
- docs(skill): capture pprof profile analysis workflow in debug-github-ci by @sanketsudake in https://github.com/fission/fission/pull/3415
- feat: RFC-0005 control-plane scalability, robustness & lifecycle by @sanketsudake in https://github.com/fission/fission/pull/3416
- feat: RFC-0005 WS3 — buildermgr on controller-runtime Manager by @sanketsudake in https://github.com/fission/fission/pull/3417
- feat: RFC-0005 WS3 — executor on controller-runtime Manager by @sanketsudake in https://github.com/fission/fission/pull/3418
- feat: RFC-0005 WS3 — router on controller-runtime Manager by @sanketsudake in https://github.com/fission/fission/pull/3419
- feat: RFC-0005 — native leader election for trigger subsystems; remove leaderelection helper by @sanketsudake in https://github.com/fission/fission/pull/3420
- refactor: RFC-0005 — replace GroupManager with errgroup; remove pkg/utils/manager by @sanketsudake in https://github.com/fission/fission/pull/3421
- feat: RFC-0005 WS3 — Reconciler scaffold + timer reconciler by @sanketsudake in https://github.com/fission/fission/pull/3422
- feat: RFC-0005 WS3 — kubewatcher reconciler by @sanketsudake in https://github.com/fission/fission/pull/3423
- refactor: RFC-0005 WS3 — canaryconfigmgr as a controller-runtime Reconciler by @sanketsudake in https://github.com/fission/fission/pull/3424
- feat: RFC-0005 WS3 — mqtrigger as a controller-runtime Reconciler by @sanketsudake in https://github.com/fission/fission/pull/3425
- feat: RFC-0005 WS3 (D1) — shared executor idle reaper by @sanketsudake in https://github.com/fission/fission/pull/3426
- feat: RFC-0005 WS3 (B1) — Package /status subresource + split status writers by @sanketsudake in https://github.com/fission/fission/pull/3427
- feat: RFC-0005 WS3 (B2) — buildermgr Environment + Package reconcilers by @sanketsudake in https://github.com/fission/fission/pull/3428
- feat: RFC-0005 WS3 (C) — router HTTPTrigger + Function reconcilers by @sanketsudake in https://github.com/fission/fission/pull/3429
- feat: RFC-0005 WS3 (D/cms) — executor ConfigMap/Secret reconcilers by @sanketsudake in https://github.com/fission/fission/pull/3430
- feat: RFC-0005 WS3 (D/container) — container executor Function reconciler by @sanketsudake in https://github.com/fission/fission/pull/3431
- feat: RFC-0005 WS3 (D/poolmgr) — pool manager Function reconciler by @sanketsudake in https://github.com/fission/fission/pull/3432
- feat: RFC-0005 WS3 (D/newdeploy) — newdeploy Function + Environment reconcilers by @sanketsudake in https://github.com/fission/fission/pull/3433
- feat: RFC-0005 WS3 (D/poolmgr-lifecycle) — poolmgr Environment + Function reconcilers by @sanketsudake in https://github.com/fission/fission/pull/3434
- feat: RFC-0005 WS3 (D/poolmgr-rs) — poolmgr ReplicaSet reconciler by @sanketsudake in https://github.com/fission/fission/pull/3435
- feat: RFC-0005 WS3 (D/poolmgr-readypod) — readyPod reconciler; retire gpmInformerFactory by @sanketsudake in https://github.com/fission/fission/pull/3436
- ci: bump integration-test kind k8s versions to v1.32.11 / v1.34.3 / v1.35.1 by @sanketsudake in https://github.com/fission/fission/pull/3437
- refactor: retire finformerFactory + fix TestGoEnv router resolver staleness by @sanketsudake in https://github.com/fission/fission/pull/3438
- logger: migrate --logger pod watch to a controller-runtime reconciler by @sanketsudake in https://github.com/fission/fission/pull/3440
- mqtrigger: migrate --mqt_keda scaler to a controller-runtime reconciler by @sanketsudake in https://github.com/fission/fission/pull/3439
- test: deepen executor updateFunction + fission CLI integration coverage by @sanketsudake in https://github.com/fission/fission/pull/3441
- perf(logger): memoize base zap logger and lazy-init webhook loggers by @sanketsudake in https://github.com/fission/fission/pull/3442
- perf(storagesvc): replace graymeta/stow with minio-go + os to drop aws-sdk-go v1 by @sanketsudake in https://github.com/fission/fission/pull/3443
- fix(utils): confine zip extraction with os.Root (zip-slip / CWE-22) by @sanketsudake in https://github.com/fission/fission/pull/3444
- refactor(fetcher,builder): confine shared-volume FS ops with os.Root helpers by @sanketsudake in https://github.com/fission/fission/pull/3445
- chore(utils): remove unused functions from pkg/utils by @sanketsudake in https://github.com/fission/fission/pull/3446
- chore: raise minimum supported Kubernetes to 1.32 by @sanketsudake in https://github.com/fission/fission/pull/3448
- fix(rbac): grant executor update on services for AdoptExistingResources + serial adopt coverage by @sanketsudake in https://github.com/fission/fission/pull/3447
- fix(executor): make AdoptExistingResources resilient to concurrent re-stamping by @sanketsudake in https://github.com/fission/fission/pull/3450
- feat(crd): validate CRDs in the API server with CEL + server-side-apply markers by @sanketsudake in https://github.com/fission/fission/pull/3449
- refactor(executor): context-bound adopt/cleanup wait + de-duplicate adopt/cleanup by @sanketsudake in https://github.com/fission/fission/pull/3451
- refactor(webhook): drop the HTTPTrigger, TimeTrigger and CanaryConfig admission webhooks by @sanketsudake in https://github.com/fission/fission/pull/3452
- refactor(validation): split CRD validation into CEL-covered and webhook-only halves by @sanketsudake in https://github.com/fission/fission/pull/3454
- refactor(executor): share deployment helpers across newdeploy/container + cleanup by @sanketsudake in https://github.com/fission/fission/pull/3453
- test(fission-cli): raise coverage + dedupe create commands; fix canary status bug by @sanketsudake in https://github.com/fission/fission/pull/3455
- Security dep sweep: k8s 0.36 / controller-runtime 0.24 / KEDA 2.20 + OTel/sarama/minio/x-image by @sanketsudake in https://github.com/fission/fission/pull/3456
- refactor(executor): share one Environment reconciler across executor types by @sanketsudake in https://github.com/fission/fission/pull/3457
- refactor(executor): share one Function reconciler across executor types by @sanketsudake in https://github.com/fission/fission/pull/3458
- refactor(executor): serve newdeploy/container IsValid from the Manager cache (retire standalone informer factory) by @sanketsudake in https://github.com/fission/fission/pull/3459
- feat(executor): opt-in cleanup finalizer for reliable cross-namespace teardown by @sanketsudake in https://github.com/fission/fission/pull/3460
- feat(executor): self-heal newdeploy/container function workloads on drift by @sanketsudake in https://github.com/fission/fission/pull/3461
- Bump the docker-images group across 5 directories with 1 update by @dependabot[bot] in https://github.com/fission/fission/pull/3311
- chore(deps): bump the github-actions group with 3 updates by @dependabot[bot] in https://github.com/fission/fission/pull/3463
- chore(deps): bump github.com/prometheus/common from 0.68.0 to 0.68.1 in the go-dependencies group by @dependabot[bot] in https://github.com/fission/fission/pull/3462
- fix(httptrigger): enforce path-safety at admission (GHSA-vchh) by @sanketsudake in https://github.com/fission/fission/pull/3464
- fix(podspec): allowlist tenant caps + force drop:[ALL] (GHSA-qf5v) by @sanketsudake in https://github.com/fission/fission/pull/3465
- chore: Go 1.26.4, kind v0.32.0, release version 1.25.0 by @sanketsudake in https://github.com/fission/fission/pull/3467
- fix(executor,fetcher): use native sleep preStop hook instead of exec /bin/sleep by @sanketsudake in https://github.com/fission/fission/pull/3468
- fix: tolerate deletion races and stop logging routine churn as errors by @sanketsudake in https://github.com/fission/fission/pull/3469
- fix(publisher): retry transient router 404s; ci: metrics-server for HPA by @sanketsudake in https://github.com/fission/fission/pull/3470
- fix(test): retry Function status writes on conflict by @sanketsudake in https://github.com/fission/fission/pull/3471
- fix(executor): close specialize-vs-delete race; tolerate finalizer write races by @sanketsudake in https://github.com/fission/fission/pull/3472
- fix(executor): scope HPA resource metrics to the function container by @sanketsudake in https://github.com/fission/fission/pull/3473
- chore: sync generated code with types.go doc comments by @sanketsudake in https://github.com/fission/fission/pull/3474
Full Changelog: https://github.com/fission/fission/compare/v1.24.0...v1.25.0