---
name: monitoring-and-performance
description: Configure and verify observability for tibi website projects. Covers OpenAPI exposure, Prometheus metrics, Sentry wiring, health/reachability checks, and the operator-facing validation that should exist before a project is considered production-ready.
---

# monitoring-and-performance

## When to use this skill

Use this skill when:

- a project needs operator-facing visibility beyond “the page loads”
- you need OpenAPI output for integrations or documentation
- you need Prometheus/Grafana visibility
- you need Sentry or similar error visibility in frontend or server flows
- you want to define the minimum health and observability checks for deploys

## Goal

Give later agents a concrete workflow for deciding what should be observable, how it is exposed, and how to verify that exposure.

This skill is not about arbitrary performance tuning. It is about making the running system inspectable enough that operators and developers can see whether it is healthy.

## Source of truth

Use these sources when implementing or reviewing observability:

- `tibi-server/docs/13-openapi-metrics.md`
- `tibi-server/docs/02-configuration.md`
- `.agents/skills/deployment/SKILL.md`
- `frontend/src/config.ts`
- relevant deploy scripts and env/config files

## OpenAPI exposure

Tibi-server generates an OpenAPI spec per project:

```text
GET /api/v1/{namespace}/openapi
```

Use this when:

- a project exposes public API surfaces that need documentation
- integrations or client generators benefit from a machine-readable contract

Collection-level OpenAPI customization lives in collection metadata via `meta.openapi`.
Use that metadata deliberately to:

- hide endpoints that should not appear in the spec
- add summaries and descriptions
- keep the public API contract readable

## Metrics exposure

Prometheus metrics are exposed at:

```text
GET /metrics
```

Key upstream metric documented today:

- `tibi_request_duration_seconds`

This is useful for:

- request latency visibility
- collection-level timing comparisons
- basic traffic and error observation in Grafana/Prometheus

Do not set operator expectations around metrics in a project and then forget to verify that the endpoint actually works in the target environment.

## Sentry and error visibility

This stack can surface errors through Sentry-related configuration.

Relevant surfaces include:

- server-level `sentry` config in tibi-server
- frontend runtime wiring in `frontend/src/config.ts`
- deploy-time release/build metadata injection where the project uses it

Use Sentry deliberately:

- define DSN, environment, and release expectations
- know whether tracing is wanted or only error capture
- make sure deploy scripts and build metadata agree with the runtime setup

Do not leave a half-configured Sentry setup that looks present but produces unusable traces.

## Health and reachability checks

At minimum, operators should be able to verify:

- website URL responds
- admin URL responds
- API responds
- OpenAPI and metrics endpoints respond when they are intended to be used

In this repo family, simple reachability probes are often the first useful health signal.

For project delivery, these checks belong next to deploy and sign-off work, not only in ad-hoc troubleshooting.

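
The reachability checks above can be sketched as a small script that runs next to deploy verification. This is a minimal sketch, not something shipped with the starter: `ProbeConfig`, `buildProbeTargets`, `exposesMetric`, and `probeAll` are hypothetical names, the website/admin/API URLs are supplied by the caller, and it assumes Node 18+ with a global `fetch`. Only the `/api/v1/{namespace}/openapi` and `/metrics` paths and the `tibi_request_duration_seconds` metric name come from the sections above.

```typescript
// Hypothetical probe types; none of these names come from the tibi docs.
interface ProbeTarget {
  name: string;
  url: string;
}

interface ProbeConfig {
  websiteUrl: string;
  adminUrl: string;
  apiUrl: string; // any API URL the project treats as a liveness signal
  serverBaseUrl: string; // assumed origin serving /api/v1/... and /metrics
  namespace: string;
  openapi: boolean; // probe OpenAPI only when it is intended to be used
  metrics: boolean; // probe metrics only when it is intended to be used
}

// Build the list of URLs to probe from the project's explicit decisions.
function buildProbeTargets(cfg: ProbeConfig): ProbeTarget[] {
  const base = cfg.serverBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const targets: ProbeTarget[] = [
    { name: "website", url: cfg.websiteUrl },
    { name: "admin", url: cfg.adminUrl },
    { name: "api", url: cfg.apiUrl },
  ];
  if (cfg.openapi) {
    targets.push({ name: "openapi", url: `${base}/api/v1/${cfg.namespace}/openapi` });
  }
  if (cfg.metrics) {
    targets.push({ name: "metrics", url: `${base}/metrics` });
  }
  return targets;
}

// True when a Prometheus text body has at least one sample line
// (not a "# HELP" / "# TYPE" comment) whose name starts with `metric`.
function exposesMetric(body: string, metric: string): boolean {
  return body
    .split("\n")
    .some((line) => !line.startsWith("#") && line.startsWith(metric));
}

// Probe every target once; a target counts as ok on a 2xx final response.
async function probeAll(
  targets: ProbeTarget[],
  timeoutMs = 5000,
): Promise<{ name: string; ok: boolean; detail: string }[]> {
  return Promise.all(
    targets.map(async (t) => {
      try {
        const res = await fetch(t.url, { signal: AbortSignal.timeout(timeoutMs) });
        return { name: t.name, ok: res.ok, detail: `HTTP ${res.status}` };
      } catch (err) {
        // Network errors and timeouts are reported as failures, not thrown.
        return { name: t.name, ok: false, detail: String(err) };
      }
    }),
  );
}
```

A deploy step could call `probeAll(buildProbeTargets(cfg))` and fail the sign-off if any entry has `ok: false`. For the metrics target, fetching the body and passing it to `exposesMetric(body, "tibi_request_duration_seconds")` confirms that the endpoint returns real samples rather than just an empty 200 response.
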
## Recommended patterns

### Public API projects

Recommended shape:

- expose OpenAPI intentionally
- add `meta.openapi` summaries for meaningful endpoints
- verify the spec against the current collection model

### Operated production projects

Recommended shape:

- metrics endpoint reachable in the target environment
- at least one documented Grafana/Prometheus use case for request timing
- explicit decision whether Sentry is used or intentionally not used

### Basic website deployments

Recommended shape:

- website/admin/API reachability checks are part of deploy verification
- observability is documented enough that later operators know what exists

## Anti-patterns

- treating observability as optional once the build passes
- exposing OpenAPI or metrics accidentally without deciding who uses them
- half-configured Sentry with no useful environment or release handling
- relying on manual browser clicks as the only production health check

## Verification checklist

After observability-related work, verify all of these:

1. intended OpenAPI exposure works and reflects the current collection config
2. intended metrics exposure works in the target environment
3. Sentry/error visibility is either intentionally configured or intentionally absent
4. deploy-time reachability checks cover website, admin, and API
5. `yarn build`, `yarn build:server`, and `yarn validate` still pass when observability wiring touched frontend/server config

## What an LLM should inspect first

When asked to set up monitoring or observability on this starter, inspect in this order:

1. `tibi-server/docs/13-openapi-metrics.md`
2. `tibi-server/docs/02-configuration.md`
3. deploy scripts and env/config files
4. `frontend/src/config.ts`
5. whether the project truly needs OpenAPI, metrics, Sentry, or only reachability checks

This prevents over-documenting features that are not actually wired and under-documenting the ones that matter operationally.