Commit Graph

127 Commits

Author SHA1 Message Date
vasyansk 5215678fe6 Merge fix/surface-provider-error: real provider error instead of internal error
On a provider error, apply/check return Selectel's real response (502)
instead of a generic internal error; internal errors remain a generic 500.
The secret does not leak (the token is not in the path, auth messages are static).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 16:10:37 +07:00
vasyansk 879e9e14b1 fix(api): surface real provider error on apply/check instead of generic internal error
resolve (shared by Check/Apply) and Apply now wrap GetRecords/ApplyChanges
failures in service.ErrProviderUnavailable, matching ZoneRecords' existing
behavior. handleApply/handleCheck use errors.Is against it to return 502
with the real provider message (e.g. Selectel's 409 conflict body) instead
of masking every failure as a generic 500 "internal error"; non-provider
errors (decrypt/db/loader) are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 15:53:27 +07:00
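The error mapping described in 879e9e14b1 can be sketched roughly as follows. This is a minimal illustration, not the project's handler code: `wrapProvider`, `statusFor`, and `errDecrypt` are hypothetical names, and only `ErrProviderUnavailable` and the 502/500 split come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrProviderUnavailable stands in for service.ErrProviderUnavailable.
var ErrProviderUnavailable = errors.New("provider unavailable")

// errDecrypt is a hypothetical non-provider failure used for the demo.
var errDecrypt = errors.New("decrypt failed")

// wrapProvider (hypothetical) tags a provider failure with the sentinel
// while keeping the provider's own message in the error text.
func wrapProvider(err error) error {
	return fmt.Errorf("%w: %v", ErrProviderUnavailable, err)
}

// statusFor maps an error to an HTTP status code and a client-facing message.
// Provider failures surface their real message; everything else stays generic.
func statusFor(err error) (int, string) {
	if errors.Is(err, ErrProviderUnavailable) {
		return http.StatusBadGateway, err.Error()
	}
	return http.StatusInternalServerError, "internal error"
}

func main() {
	fmt.Println(statusFor(wrapProvider(errors.New("409 conflict"))))
	fmt.Println(statusFor(errDecrypt))
}
```

Matching on a sentinel with `errors.Is` keeps the handler independent of how deep in the call chain the provider error was wrapped.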
vasyansk 6f9958af60 Merge feature/selective-apply-order: selective Apply + deletes before updates
Per-record checkboxes (updates/prunes by key, select-all, prune opt-in);
deletes are applied BEFORE updates, so Selectel no longer rejects a CNAME/A
conflict on the same name. The deletes-first order is documented as an invariant.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 15:32:54 +07:00
vasyansk e283e5f22a fix: document apply ordering invariant; visible indeterminate checkbox + test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 15:26:46 +07:00
vasyansk 2f1f5311ad feat(web): per-record apply checkboxes with select-all; prune opt-in
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-05 15:20:09 +07:00
vasyansk 0b26923586 feat(apply): per-record selection + deletes-before-updates ordering
RecordDiff.Key() gives a stable normalized identifier ("TYPE name.") for
every diff kind, exposed as recordView.Key. ApplyRequest now takes
Updates/Prunes key lists instead of two booleans, so callers can apply a
subset of records. service.Apply builds the applied set with selected
prunes (Delete) added before selected updates (Add/Update) — an
invariant, not an option — since the provider rejects an Add/Update
whose name still conflicts with an existing record (e.g. a CNAME cannot
be created while an A on the same name still exists).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 15:10:01 +07:00
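A rough sketch of the selection and ordering rule from 0b26923586, under stated assumptions: the `RecordDiff` fields, the `Kind` constants, the exact key normalization, and the helper `selectApplied` are all illustrative stand-ins, since the log only names `RecordDiff.Key()`, the "TYPE name." key shape, and the deletes-first invariant.

```go
package main

import (
	"fmt"
	"strings"
)

type Kind int

const (
	Add Kind = iota
	Update
	Delete
)

// RecordDiff is a simplified stand-in for the project's diff entry.
type RecordDiff struct {
	Kind Kind
	Type string
	Name string
}

// Key returns "TYPE name." (assumed normalization: upper-case type,
// lower-case name, exactly one trailing dot).
func (d RecordDiff) Key() string {
	name := strings.ToLower(strings.TrimRight(d.Name, "."))
	return strings.ToUpper(d.Type) + " " + name + "."
}

func toSet(keys []string) map[string]bool {
	set := make(map[string]bool, len(keys))
	for _, k := range keys {
		set[k] = true
	}
	return set
}

// selectApplied keeps only the selected diffs and always places selected
// deletes ahead of selected adds/updates, regardless of input order.
func selectApplied(diffs []RecordDiff, updates, prunes []string) []RecordDiff {
	up, pr := toSet(updates), toSet(prunes)
	var applied []RecordDiff
	for _, d := range diffs {
		if d.Kind == Delete && pr[d.Key()] {
			applied = append(applied, d)
		}
	}
	for _, d := range diffs {
		if d.Kind != Delete && up[d.Key()] {
			applied = append(applied, d)
		}
	}
	return applied
}

func main() {
	diffs := []RecordDiff{
		{Kind: Add, Type: "CNAME", Name: "www.example.com"},
		{Kind: Delete, Type: "A", Name: "www.example.com."},
		{Kind: Update, Type: "TXT", Name: "example.com."},
	}
	applied := selectApplied(diffs,
		[]string{"CNAME www.example.com."},
		[]string{"A www.example.com."})
	for _, d := range applied {
		fmt.Println(d.Kind, d.Key())
	}
}
```

Here the A record delete lands before the CNAME add even though the add comes first in the input, and the unselected TXT update is dropped.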
vasyansk fc19678727 docs: plan for per-record apply selection + deletes-first order
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 15:03:14 +07:00
vasyansk 0b2b9c6e3e Merge fix/check-status-and-diff-layout: status on manual check + wrapping of long values
(A) A manual check persists last_check_status (it stayed unknown until the scheduler ran);
SetDomainStatus is scoped by project_id (closes a write IDOR).
(B) Long values (DKIM/SPF) wrap in the diff; horizontal overflow is removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 14:48:23 +07:00
vasyansk 27d70a987e fix(store): scope SetDomainStatus by project (IDOR); scheduler reuses DeriveStatus
handleCheck's error branch wrote last_check_status via an id-only UPDATE, so
an authenticated caller's own valid project id paired with a foreign domain
id in the URL could flip a stranger's domain to "error" even though Check
itself is project-scoped and would 404/error out first. Add project_id to
the WHERE clause (queries/domains.sql + generated db/domains.sql.go), thread
projectID through Store/TenantStore/SchedStore SetDomainStatus, and pass pid
from context at both call sites in handleCheck plus the scheduler.

Also collapse checkDomain's inline status derivation in scheduler.go into a
call to service.DeriveStatus, the same helper handleCheck already uses, so
there's a single source of truth for "drift vs in_sync" instead of two
copies that could drift apart.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 14:40:13 +07:00
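The project-scoping fix in 27d70a987e can be illustrated with an in-memory store. This is only a sketch: the real change is in a SQL WHERE clause, and the `store` and `domain` types, the argument order, and the rows-affected return value here are assumptions made for the example.

```go
package main

import "fmt"

type domain struct {
	ID        int64
	ProjectID int64
	Status    string
}

type store struct{ domains []domain }

// SetDomainStatus updates a domain only when both the id and the project
// match, mirroring a statement of the shape
//   UPDATE domains SET last_check_status = ? WHERE id = ? AND project_id = ?
// It returns the number of rows changed.
func (s *store) SetDomainStatus(projectID, id int64, status string) int {
	changed := 0
	for i := range s.domains {
		if s.domains[i].ID == id && s.domains[i].ProjectID == projectID {
			s.domains[i].Status = status
			changed++
		}
	}
	return changed
}

func main() {
	s := &store{domains: []domain{{ID: 1, ProjectID: 10, Status: "in_sync"}}}

	// A caller from project 99 names a foreign domain id: nothing changes.
	fmt.Println(s.SetDomainStatus(99, 1, "error"), s.domains[0].Status)

	// The owning project can still update its own domain.
	fmt.Println(s.SetDomainStatus(10, 1, "drift"), s.domains[0].Status)
}
```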
vasyansk 784e7bd822 fix(web): wrap long record values in diff and zone view (no horizontal overflow)
RecordRow now splits into a top line (badge/name/read-only, unaffected
by value length) and a plain block-level values line below it, so a
~400-char unbreakable DKIM key wraps via break-all instead of stretching
the flex row and forcing page-wide horizontal scroll. Zone records table
gets break-all on the values cell too.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 14:27:33 +07:00
vasyansk 1b367c4bda fix(api): manual check persists last_check_status (was stale unknown)
Manual domain checks (Recheck button / diff page load) never wrote
domains.last_check_status - only the scheduler did, leaving a
newly-templated domain stuck at "unknown" until the next scheduled run.

Extract status derivation into internal/service (single source of truth):
StatusUnknown/InSync/Drift/Error constants and DeriveStatus(diff.Changeset).
The scheduler now aliases these constants instead of duplicating them.
handleCheck persists the derived status (or StatusError on failure) via
TenantStore.SetDomainStatus after every manual check - status/history only,
no notification, which remains the scheduler's job.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 14:22:02 +07:00
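A minimal sketch of the shared status derivation from 1b367c4bda. The constant names and string values follow the log; the `Changeset` fields, the rule that any pending update or prune counts as drift, and the helper `statusAfterCheck` are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	StatusUnknown = "unknown"
	StatusInSync  = "in_sync"
	StatusDrift   = "drift"
	StatusError   = "error"
)

// Changeset is a simplified stand-in for diff.Changeset.
type Changeset struct {
	Updates []string
	Prunes  []string
}

// errCheck is a hypothetical check failure used for the demo.
var errCheck = errors.New("check failed")

// DeriveStatus reports drift when anything is pending (assumed rule).
func DeriveStatus(cs Changeset) string {
	if len(cs.Updates) > 0 || len(cs.Prunes) > 0 {
		return StatusDrift
	}
	return StatusInSync
}

// statusAfterCheck (hypothetical) is what a manual check or the scheduler
// would persist: StatusError on failure, the derived status otherwise.
func statusAfterCheck(cs Changeset, err error) string {
	if err != nil {
		return StatusError
	}
	return DeriveStatus(cs)
}

func main() {
	fmt.Println(statusAfterCheck(Changeset{}, nil))
	fmt.Println(statusAfterCheck(Changeset{Updates: []string{"A www.example.com."}}, nil))
	fmt.Println(statusAfterCheck(Changeset{}, errCheck))
}
```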
vasyansk cc5e562a67 docs: plan for manual-check status + diff value wrapping
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 14:15:02 +07:00
vasyansk aa7e8c705a Merge feature/template-placeholders: {{domain_name}} placeholder in templates
A template stores records with {{domain_name}} and is materialized against the zone name at
diff/apply (reusable across domains); a zone snapshot auto-parameterizes
(matching on DNS-label boundaries); hint in the editor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 14:01:34 +07:00
vasyansk 91f7a02f2c fix(tmpl): parameterize zone name only on DNS-label boundaries
Parameterize used strings.ReplaceAll, so an external host that merely
ends with the zone name as a substring (e.g. "notreconops.ru." against
zone "reconops.ru") was falsely rewritten to "not{{domain_name}}.".
Replace only where the zone name sits on a DNS-label boundary (start/
end of string or a non-alphanumeric/hyphen character), and resolve to
a fixed point so adjacent occurrences sharing a single boundary
character are still both replaced.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-05 13:58:55 +07:00
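The boundary rule from 91f7a02f2c can be sketched as below. Note that this version swaps in a different technique from the one the commit describes: instead of replacing to a fixed point, it scans the original string once and checks the neighbours of each occurrence, which also handles two occurrences that share one boundary character. The function body and `isLabelChar` are illustrative, not the project's code.

```go
package main

import (
	"fmt"
	"strings"
)

const placeholder = "{{domain_name}}"

// isLabelChar reports whether b can appear inside a DNS label
// (letters, digits, hyphen). Anything else counts as a boundary.
func isLabelChar(b byte) bool {
	return b == '-' ||
		(b >= '0' && b <= '9') ||
		(b >= 'a' && b <= 'z') ||
		(b >= 'A' && b <= 'Z')
}

// Parameterize replaces zone with the placeholder only where the match is
// delimited by the string edges or by non-label characters on both sides.
func Parameterize(value, zone string) string {
	if zone == "" {
		return value
	}
	var out strings.Builder
	i := 0
	for i < len(value) {
		j := strings.Index(value[i:], zone)
		if j < 0 {
			break
		}
		start := i + j
		end := start + len(zone)
		leftOK := start == 0 || !isLabelChar(value[start-1])
		rightOK := end == len(value) || !isLabelChar(value[end])
		if leftOK && rightOK {
			out.WriteString(value[i:start])
			out.WriteString(placeholder)
			i = end
		} else {
			// Not on a boundary: keep one byte and keep scanning.
			out.WriteString(value[i : start+1])
			i = start + 1
		}
	}
	out.WriteString(value[i:])
	return out.String()
}

func main() {
	zone := "reconops.ru"
	fmt.Println(Parameterize("www.reconops.ru.", zone))
	fmt.Println(Parameterize("notreconops.ru.", zone))
	fmt.Println(Parameterize("reconops.ru reconops.ru", zone))
}
```

Because boundary checks always look at the original bytes, a replacement never changes what counts as a boundary for the next occurrence.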
vasyansk 655ae8ccf8 feat(web): hint about {{domain_name}} placeholder in template editor
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-05 13:48:24 +07:00
vasyansk df895d8850 feat(tmpl): {{domain_name}} placeholder — materialize on diff/apply, parameterize on snapshot
Adds internal/tmpl with Materialize (template placeholder -> zone name) and
Parameterize (zone name -> placeholder, the inverse used by the
template-from-zone snapshot). service.resolve now materializes the template
against DomainRef.ZoneName before diffing, so one template can be reused
across domains. LoadDomainFull (source query + hand-edited sqlc output, since
sqlc is not installed) now also selects zone_name to populate it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 13:41:18 +07:00
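A minimal sketch of the materialize step from df895d8850, assuming a simplified `Record` shape and a zone name passed without a trailing dot. The real `internal/tmpl` types and signature are not shown in the log, so everything beyond the `Materialize` name and the `{{domain_name}}` placeholder is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const placeholder = "{{domain_name}}"

// Record is a simplified stand-in for a template record.
type Record struct {
	Type   string
	Name   string
	Values []string
}

// Materialize returns a copy of the template with every placeholder
// replaced by the zone name; the template itself is left untouched so it
// can be reused for other domains.
func Materialize(tmpl []Record, zone string) []Record {
	out := make([]Record, len(tmpl))
	for i, r := range tmpl {
		vals := make([]string, len(r.Values))
		for j, v := range r.Values {
			vals[j] = strings.ReplaceAll(v, placeholder, zone)
		}
		out[i] = Record{
			Type:   r.Type,
			Name:   strings.ReplaceAll(r.Name, placeholder, zone),
			Values: vals,
		}
	}
	return out
}

func main() {
	tmpl := []Record{
		{Type: "CNAME", Name: "www.{{domain_name}}.", Values: []string{"{{domain_name}}."}},
	}
	for _, zone := range []string{"example.org", "example.net"} {
		fmt.Println(Materialize(tmpl, zone))
	}
}
```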
vasyansk 135917216c docs: plan for {{domain_name}} template placeholders
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 13:34:00 +07:00
vasyansk 40cec05c9a Merge fix/empty-changeset-null: empty changeset → [] not null (white screen after snapshot)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 13:11:01 +07:00
vasyansk bc2e77ad4e fix: empty changeset must serialize as [] not null (white-screen after snapshot)
toChangesetResponse initialises updates/prunes/readOnly so a zone matching
its template exactly (e.g. right after 'create template from zone') marshals
arrays, not null. DiffView/DomainDiffPage also normalise null defensively.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 13:10:08 +07:00
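The nil-versus-empty distinction behind bc2e77ad4e is standard `encoding/json` behavior: a nil slice marshals as `null`, an initialized empty slice as `[]`. The sketch below assumes a response struct with three string slices; the field types and JSON names are illustrative, only the function name `toChangesetResponse` comes from the log.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// changesetResponse is a simplified stand-in for the API response.
type changesetResponse struct {
	Updates  []string `json:"updates"`
	Prunes   []string `json:"prunes"`
	ReadOnly []string `json:"readOnly"`
}

// toChangesetResponse starts from non-nil empty slices so that an empty
// changeset still marshals as arrays.
func toChangesetResponse(updates, prunes, readOnly []string) changesetResponse {
	resp := changesetResponse{
		Updates:  []string{},
		Prunes:   []string{},
		ReadOnly: []string{},
	}
	resp.Updates = append(resp.Updates, updates...)
	resp.Prunes = append(resp.Prunes, prunes...)
	resp.ReadOnly = append(resp.ReadOnly, readOnly...)
	return resp
}

func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(changesetResponse{}))                // nil slices
	fmt.Println(mustJSON(toChangesetResponse(nil, nil, nil))) // empty slices
}
```

The first line prints `{"updates":null,"prunes":null,"readOnly":null}` and the second `{"updates":[],"prunes":[],"readOnly":[]}`, which is the shape a frontend can iterate over without a null check.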
vasyansk 08697e06d7 Merge feature/zone-view-snapshot: zone view without a template + snapshot template
View a zone's records without an attached template (read-only), create
a snapshot template from the zone's current state (managed-only, auto-attach),
"no template" status instead of unknown, domain delete button removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 12:58:20 +07:00
vasyansk 9f0938daea fix: reject snapshot when template already attached (409); handle domains-load error; drop orphaned useDeleteDomain
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-05 12:54:52 +07:00
vasyansk 137113cbe6 fix(web): gate zone-records fetch to no-template case; wait for domains load before branching
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-05 12:28:07 +07:00
vasyansk c2832348f8 feat(web): view zone without template, snapshot button, no-template status, drop delete
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 12:19:50 +07:00
vasyansk 5662334799 fix(api): distinguish domain-not-found (404) from provider failure (502) on zone endpoints
Introduce service.ErrProviderUnavailable, wrapped only around the
provider GetRecords call in ZoneRecords. handleZoneRecords and
handleTemplateFromZone now use errors.Is against it to tell a real
provider outage (502) apart from local resolution failures such as an
unknown domain (404), instead of collapsing every ZoneRecords error
into a blanket 502. Also fixes handleTemplateFromZone's GetDomain
error branch to return 404 "domain not found" instead of 500, for
consistency with handleSetDomainTemplate/handleDomainHistory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 12:14:46 +07:00
vasyansk 9ccb304d2e feat(api): read zone records without template + snapshot-to-template
LoadDomain requires a template, so a zone without one could never be
viewed or snapshotted. Adds a template-free path: store.LoadZone /
service.ZoneRef / DomainService.ZoneRecords read a zone's live records
straight from the provider (no diff, no template). GET
/domains/{did}/records exposes read-only viewing; POST
/domains/{did}/template-from-zone snapshots only managed record types
(NS/SOA excluded) into a new template and auto-attaches it to the domain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 12:00:27 +07:00
vasyansk 1540140542 docs: plan for zone view without template + snapshot
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-05 11:48:21 +07:00
vasyansk b4e34f5b9b Merge feature/selectel-iam-auth: fix 401 with a project IAM token for Cloud DNS v2
The app obtains a 24h IAM token from the service-user account (Identity API),
caches and refreshes it; the account is an encrypted JSON {username,password,account_id,project_name};
credentials are validated by a trial login when the account is added; 4-field form.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 21:25:39 +07:00
vasyansk e8e7371f09 fix: drain Identity error body (keep-alive); reject whitespace-only credential fields in form
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 20:36:50 +07:00
vasyansk be408a216c feat(web): Selectel service-user account form (IAM credentials)
Replace the single API-key field with 4 IAM service-user fields
(username, password, account_id, project_name) matching the new
backend contract; map 400 "invalid provider credentials" to a
user-facing message.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 20:23:34 +07:00
vasyansk 568452846a feat(api): structured provider credentials + trial-auth validation on account create
POST /accounts now accepts secret as a provider-specific JSON object
instead of an opaque string, and validates credentials via
provider.Provider.Validate before persisting — invalid credentials get
a generic 400 without ever reaching Store.CreateAccount or echoing the
secret back.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 20:12:41 +07:00
vasyansk 32107571d1 feat(selectel): project-scoped IAM auth with token cache; provider Validate
Selectel Cloud DNS v2 requires a project IAM token in X-Auth-Token, not the
raw service-user secret; the previous client sent the static secret directly
and got 401. The client now parses Credentials.Secret as a Creds JSON blob
(username/password/account_id/project_name), exchanges it for a token via
the Identity API (POST /identity/v3/auth/tokens), and caches the token in
memory per-account until 5 minutes before expiry. ListZones/GetRecords/
ApplyChanges send the cached IAM token instead of the raw secret.

provider.Provider gains a Validate(ctx, Credentials) method so a bad account
can be rejected via trial login at creation time; all Provider fakes across
provider/registry/api/service test packages implement it as a no-op stub for
now (Task 2 will make api's mock configurable).

Security: the service-user password is folded into the token cache key via
SHA-256 (never stored in the clear) so a password change invalidates the
cached token; identity errors are generic and never echo the request body.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 20:02:36 +07:00
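The token cache described in 32107571d1 might look roughly like this. It is a sketch under assumptions: the `TokenCache` type, the injected `fetch` function (standing in for the Identity API login), the `manualClock` test helper, and the exact key layout are all hypothetical. Only the credential fields, the SHA-256 treatment of the password, the 24h lifetime, and the 5-minute refresh margin come from the log.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

const (
	tokenTTL      = 24 * time.Hour  // token lifetime mentioned in the log
	refreshMargin = 5 * time.Minute // refresh this long before expiry
)

type Creds struct{ Username, Password, AccountID, ProjectName string }

type Token struct {
	Value   string
	Expires time.Time
}

// manualClock is a hand-advanced clock so the demo needs no real waiting.
type manualClock struct{ t time.Time }

func (m *manualClock) Now() time.Time          { return m.t }
func (m *manualClock) Advance(d time.Duration) { m.t = m.t.Add(d) }

// cacheKey identifies an account; the password enters only as a SHA-256
// digest, so changing it yields a new key and bypasses the old token.
func cacheKey(c Creds) string {
	sum := sha256.Sum256([]byte(c.Password))
	return c.AccountID + "/" + c.ProjectName + "/" + c.Username + "/" + hex.EncodeToString(sum[:])
}

type TokenCache struct {
	mu     sync.Mutex
	tokens map[string]Token
	now    func() time.Time
	fetch  func(Creds) (Token, error) // stands in for the Identity API login
}

func NewTokenCache(now func() time.Time, fetch func(Creds) (Token, error)) *TokenCache {
	return &TokenCache{tokens: map[string]Token{}, now: now, fetch: fetch}
}

// Token returns the cached token while it is more than refreshMargin away
// from expiry and logs in again otherwise. The lock is held during the
// fetch for simplicity.
func (tc *TokenCache) Token(c Creds) (string, error) {
	tc.mu.Lock()
	defer tc.mu.Unlock()
	key := cacheKey(c)
	if t, ok := tc.tokens[key]; ok && tc.now().Before(t.Expires.Add(-refreshMargin)) {
		return t.Value, nil
	}
	t, err := tc.fetch(c)
	if err != nil {
		return "", err
	}
	tc.tokens[key] = t
	return t.Value, nil
}

func main() {
	clk := &manualClock{}
	logins := 0
	tc := NewTokenCache(clk.Now, func(Creds) (Token, error) {
		logins++
		return Token{Value: fmt.Sprintf("token-%d", logins), Expires: clk.Now().Add(tokenTTL)}, nil
	})
	c := Creds{Username: "svc", Password: "old", AccountID: "1", ProjectName: "dns"}

	tok, _ := tc.Token(c)
	fmt.Println(tok, logins)
	clk.Advance(tokenTTL - refreshMargin) // inside the refresh window now
	tok, _ = tc.Token(c)
	fmt.Println(tok, logins)
}
```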
vasyansk 617b02dbfb docs: plan for Selectel IAM auth (Cloud DNS v2)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 19:52:42 +07:00
vasyansk 774b480677 Merge feature/tech-debt-docker: tech debt from Phases 1-3 + docker compose
T1 graceful scheduler shutdown + /healthz + healthcheck mode;
T2 per-channel notification metrics; T3 frontend code-splitting + allowlist;
T4 multi-stage Dockerfile (distroless nonroot); T5 docker compose (app+postgres).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 16:38:45 +07:00
vasyansk 77ca0200ae build: docker compose (app + postgres) with healthchecks and .env
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 16:27:16 +07:00
vasyansk 675136e488 build: mirror .gitignore dist rules in .dockerignore for hermetic builds
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 16:23:20 +07:00
vasyansk 7d875ea19a build: multi-stage Dockerfile (node build -> go embed -> distroless)
Three-stage image: node:22-alpine builds the Vite SPA, golang:1.26.4-alpine
compiles the server with the built SPA copied into the //go:embed path
before build, distroless/static-debian12:nonroot runs the static binary
as non-root on :8080. .dockerignore keeps node_modules/dist/docs/git out
of the build context while preserving the internal/web/dist/index.html
placeholder needed for a valid embed target pre-COPY.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 16:15:55 +07:00
vasyansk 7256adf637 fix(web): scope Suspense to page body; guard formatConfig against null config
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 16:12:21 +07:00
vasyansk 8c35aed8f2 perf(web): route-level code-splitting; harden channel config rendering
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 16:04:17 +07:00
vasyansk 41844d49a0 test(notify): assert per-channel results on decrypt-fail and unknown-type
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 16:01:14 +07:00
vasyansk f14916396c feat(notify): per-channel delivery results + accurate notification metrics
Dispatcher.Send now returns []ChannelResult{Type, Err} alongside the
aggregated error, and scheduler.checkDomain increments
NotificationsTotal per channel type/status instead of a single
unconditional IncNotification("dispatch", newStatus) placeholder that
ignored per-channel delivery outcome.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 15:56:15 +07:00
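The per-channel result shape from f14916396c can be sketched as follows. `ChannelResult{Type, Err}` and the `Dispatcher.Send` name come from the log; the `Channel` interface, `fakeChannel`, the use of `errors.Join` for aggregation, and the `type/status` counter keys are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

type ChannelResult struct {
	Type string
	Err  error
}

// Channel is a hypothetical minimal notification channel.
type Channel interface {
	Type() string
	Send(msg string) error
}

// fakeChannel always returns its configured error (nil means success).
type fakeChannel struct {
	typ string
	err error
}

func (f fakeChannel) Type() string      { return f.typ }
func (f fakeChannel) Send(string) error { return f.err }

var errDelivery = errors.New("delivery failed")

type Dispatcher struct{ Channels []Channel }

// Send tries every channel and returns one result per channel plus an
// aggregated error that is nil only when all deliveries succeeded.
func (d Dispatcher) Send(msg string) ([]ChannelResult, error) {
	results := make([]ChannelResult, 0, len(d.Channels))
	var errs []error
	for _, ch := range d.Channels {
		err := ch.Send(msg)
		results = append(results, ChannelResult{Type: ch.Type(), Err: err})
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", ch.Type(), err))
		}
	}
	return results, errors.Join(errs...)
}

// countByTypeStatus tallies results under "type/status" keys, the way a
// labelled counter metric would be incremented.
func countByTypeStatus(results []ChannelResult) map[string]int {
	counts := map[string]int{}
	for _, r := range results {
		status := "ok"
		if r.Err != nil {
			status = "error"
		}
		counts[r.Type+"/"+status]++
	}
	return counts
}

func main() {
	d := Dispatcher{Channels: []Channel{
		fakeChannel{typ: "telegram"},
		fakeChannel{typ: "webhook", err: errDelivery},
	}}
	results, err := d.Send("status changed")
	fmt.Println(countByTypeStatus(results), err)
}
```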
vasyansk e9a100ab4a fix(server): drain scheduler on unexpected serve error before exit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 15:52:40 +07:00
vasyansk a27ddc79e8 feat(server): graceful scheduler shutdown, /healthz, healthcheck mode
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 15:46:56 +07:00
vasyansk c265d36bdb docs: plan for tech-debt cleanup + docker compose
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 15:43:00 +07:00
vasyansk f80d700a83 Merge feature/phase-3: scheduler, notifications (Telegram/Webhook), Prometheus metrics
Scheduler for periodic checks (read-only), notifications on status change
(bot_token is encrypted, SSRF guard with IP pinning + CGNAT), Prometheus /metrics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 15:16:58 +07:00
vasyansk 504c4c081f fix(phase3): skip templateless domains in scheduler; block CGNAT range in webhook SSRF guard
Domains imported without a template (TemplateID == nil) are a valid,
unconfigured state, not a failure — RunOnce now skips them before
calling checkDomain instead of letting LoadDomain's "no template" error
turn into StatusError and a spammy unknown->error notification.

isBlockedIP now also rejects 100.64.0.0/10 (RFC 6598 carrier-grade
NAT), which net.IP.IsPrivate() does not cover, closing an SSRF gap in
the webhook destination guard (both the pre-request check and the
per-dial check use isBlockedIP).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 14:58:09 +07:00
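The CGNAT gap closed in 504c4c081f follows from the standard library: `net.IP.IsPrivate` covers the RFC 1918 and RFC 4193 ranges but not 100.64.0.0/10. A minimal sketch of such a guard is below. The set of other checks in `isBlockedIP` and the string helper `blocked` are assumptions, since the log only states that the CGNAT range was added.

```go
package main

import (
	"fmt"
	"net"
)

// cgnat is the RFC 6598 shared address space (100.64.0.0 to 100.127.255.255).
var cgnat = mustCIDR("100.64.0.0/10")

func mustCIDR(s string) *net.IPNet {
	_, n, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return n
}

// isBlockedIP rejects addresses a webhook must never reach. The checks
// other than the CGNAT one are an assumed baseline, not the project's list.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() ||
		ip.IsPrivate() ||
		ip.IsUnspecified() ||
		ip.IsLinkLocalUnicast() ||
		ip.IsMulticast() ||
		cgnat.Contains(ip)
}

// blocked (hypothetical helper) parses s and fails closed on bad input.
func blocked(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil {
		return true
	}
	return isBlockedIP(ip)
}

func main() {
	for _, s := range []string{"100.64.0.1", "100.128.0.1", "10.0.0.1", "8.8.8.8"} {
		fmt.Println(s, blocked(s))
	}
}
```

Running the same function both before the request and at dial time, as the commit describes, keeps the two checks from disagreeing about what counts as blocked.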
vasyansk 34422420ca feat(web): schedule, notification channels, check history, drift badge 2026-07-04 14:40:29 +07:00
vasyansk 45259b9720 feat(web,api): client/hooks for schedule/channels/history + lastCheckStatus in domainResponse
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BwxdSt4reTm7Dj1oxRvpP3
2026-07-04 14:24:02 +07:00
vasyansk b31f886ae2 feat(server): start the scheduler, /metrics, graceful shutdown 2026-07-04 14:14:00 +07:00
vasyansk 9475af441e fix(scheduler): remove double SaveCheckRun (Checker persists), SetDrift via CountDriftDomains, resolved after error 2026-07-04 14:03:49 +07:00
vasyansk 23e02d6804 feat(scheduler): in-process check scheduler + status change + notifications + metrics 2026-07-04 13:53:06 +07:00