Newer
Older
smart-home-server / server / docs / review-phase-4-automation-scripts-infrastructure.md

Phase 4 Review: Automation, Scripts, Infrastructure

Date: 2026-06-02 Scope: ControlScripts layer, event routing, cron/background execution, infrastructure (config, logging, DB, CLI), Fury framework services.


Critical (C)

C1. Static script registries leak state across requests

File: server/SHServ/Middleware/ControlScripts.php Lines: 12–13

protected static $regular_scripts = [];
protected static $actions_scripts = [];

static arrays accumulate scripts across all instantiations. In PHP-FPM, statics persist per worker process. Adding/removing scopes or reloading code does not clear the arrays. Tests (ScriptsModelStateTest) use reflection to reset, but production has no cleanup mechanism. If a scope file is deleted, its scripts remain in static memory until worker restart.

Impact: Stale script entries, phantom actions, memory growth, unpredictable behavior after deploys.

Fix: Flush statics at the start of control_scripts_init(), or use instance-level storage instead of static.


C2. No atomic write / backup for scope file updates

File: server/SHServ/Controllers/ScriptsRESTAPIController.php Method: scope_update file_put_contents($filepath, $file) writes directly over the live PHP class file. If the request is interrupted mid-write, the file is truncated/corrupted. PHP will fail to parse it on the next request, breaking the entire scope and potentially the application.

Impact: Broken scope class → fatal parse error → 500 for all subsequent requests using that scope.

Fix: Write to a temp file, rename() atomically, keep the previous version as .bak.


C3. Events post-flush handlers can hang indefinitely

File: server/SHServ/Controllers/EventsController.php Method: new_event After fastcgi_finish_request(), event handlers are dispatched. Handlers like btn_on_onlinesync_btn_channels make synchronous HTTP calls to devices via Button/Relay API. set_time_limit(10) applies to the whole block, but with 5 event dispatches and multiple device calls, 10 seconds may not be enough. If a device is slow or unresponsive, the PHP process is held busy.

Impact: PHP-FPM worker exhaustion, delayed response for other requests, cron/scanning blocked.

Fix: Enforce a strict per-handler timeout (e.g. 2s), run handlers asynchronously via a queue or background job, or use non-blocking HTTP.


C4. RCE via scope_update (reinforced)

File: server/SHServ/Controllers/ScriptsRESTAPIController.php Any authenticated user can overwrite a scope file with arbitrary PHP. The file is immediately included on the next request via control_scripts_init(). Combined with no RBAC, this is trivial remote code execution.

Impact: Full server compromise.

Fix: Sandbox scope code (e.g. restricted DSL), validate with php -l, disable live editing in production, require admin role.


C5. sync-map.json loaded from disk on every scope init, no caching

File: server/ControlScripts/Common.php Method: register_global_device_sync_map Each scope that uses Common re-reads sync-map.json via file_get_contents. With 4 scopes, the file is read 4 times per request (~9 KB each). No opcode cache or in-memory cache.

Impact: Unnecessary disk I/O, parse overhead on every request.

Fix: Load once in App::control_scripts_init() and pass to scopes, or cache decoded array in a static/shared variable.


C6. Cron script execution stops on first failure

File: server/SHServ/Controllers/CronController.php Method: run_regular_cron_scripts

foreach($regular_scripts as $alias => $script) {
    // ...
    $script["script"]();
}

No try/catch. If one regular script throws, the exception bubbles up and the remaining scripts in the loop never execute.

Impact: Partial automation — some scheduled tasks silently skipped.

Fix: Wrap each script invocation in try/catch, log the error, continue the loop.


High (H)

H1. Action scripts have no error handling

File: server/SHServ/Middleware/ControlScripts.php Method: run_action_script

$result = self::$actions_scripts[$alias]["script"]($params);

No try/catch. An exception in an action script kills the API request with a 500 and no meaningful response to the client.

Fix: Wrap in try/catch, return structured error response.


H2. get_source_code() re-reads the source file on every script registration

File: server/SHServ/Middleware/ControlScripts.php Method: get_source_code Uses file() + array_slice to extract closure source code. Called once per add_action_script / add_regular_script. For 20 scripts, the file is read 20 times.

Impact: Redundant disk I/O during every request bootstrap.

Fix: Cache file contents per class, or use ReflectionFunction::getClosureUsedVariables + store source once at build time.


H3. control_scripts_init() instantiates all scopes eagerly

File: server/SHServ/App.php Method: control_scripts_init All .php files in ControlScripts/Scopes/ are scanned, included, and instantiated on every request. There is no lazy loading — even disabled scopes run their constructors and all four register_* methods (though register_actions_scripts / register_regular_scripts are gated by DB state, the constructor still runs).

Impact: Slower bootstrap, unnecessary DB queries, memory overhead.

Fix: Lazy-load scopes when their scripts are actually needed, or cache registration metadata.


H4. No health check / status for cron endpoints

File: server/SHServ/Controllers/CronController.php Both run_regular_cron_scripts() and status_update_scanning() return nothing (implicit null). The HTTP response body is empty, HTTP status is 200 by default. There is no way to tell from the response whether scripts ran, how many succeeded, or if any failed.

Impact: Blind monitoring. External cron scheduler (e.g. systemd timer, crontab) cannot detect failures.

Fix: Return structured JSON with execution summary, error count, duration.


H5. Logging writes to JSON without rotation or size limits

File: server/Fury/Kernel/Logging.php Method: dump Appends session logs to Logs/d.m.Y.log.json. File grows indefinitely. No max size, no rotation, no cleanup. On high-traffic days the file can become several GB.

Impact: Disk exhaustion, performance degradation on log write (lock contention, JSON re-encode of huge array).

Fix: Rotate by size (e.g. 10 MB), archive old logs, or switch to structured line-based logging.


H6. No structured logging / severity / correlation IDs

File: server/Fury/Kernel/Logging.php logging()->set($place, $title, $message) accepts only three strings. No severity levels, no request ID, no timestamps per entry (only per session), no structured fields.

Impact: Hard to debug production issues, impossible to trace a single request through controllers, models, and device calls.

Fix: Add severity, request/correlation ID, timestamp per log line, optional context array.


H7. ErrorHandler may leak debug info if debug flag is on

File: server/Fury/Modules/ErrorHandler/ErrorHandler.php Method: view_fatal_error If FCONF["debug"] is true, the handler renders full file paths, source code excerpts, and stack traces as HTML. The debug flag is controlled by .env DEBUG. With no .env file, it defaults to false, but a misconfiguration or accidental .env edit exposes internals.

Impact: Information disclosure — file paths, code snippets, server internals.

Fix: In production mode, never render HTML errors; always log and return generic "Internal Server Error" JSON.


H8. No DB schema / migrations

File: project-wide There are no .sql migration files, no schema versioning, no install script. The scripts table definition only exists inside tests/ScriptsModelStateTest.php as an inline CREATE TABLE. Production database setup is manual and undocumented.

Impact: Impossible to recreate production DB, risky deploys, schema drift between environments.

Fix: Create server/migrations/ with numbered SQL files and a CLI runner.


H9. .env absent — fallbacks use insecure defaults

File: server/SHServ/config.php If .env is missing, DB credentials default to user => "root", password => "", host => "localhost". Device IP scan range defaults to 192.168.2.2–254.

Impact: Accidental connection to wrong/weak DB, scan of wrong subnet.

Fix: Fail hard if .env is missing in production; do not provide insecure fallbacks.


H10. ControlScripts constructor performs DB check per scope

File: server/SHServ/Middleware/ControlScripts.php Line: 26

if($scope_folder != "ControlScripts" or (new Scripts()) -> script_state("scope", $scope_name)) {

For every non-ControlScripts scope, a new Scripts model is instantiated and a DB query is issued to check if the scope is enabled. With N scopes, N DB round-trips on every request.

Impact: N+1 query problem during every bootstrap.

Fix: Batch-fetch all scope states in a single query in App::control_scripts_init().


H11. sync_map() returns mutable reference to shared static state

File: server/SHServ/Middleware/ControlScripts.php Method: sync_map Returns self::$sync_map_storage. Callers (scopes) can mutate the shared array, affecting other scopes.

Impact: Cross-scope state pollution, hard-to-debug side effects.

Fix: Return a deep copy, or make sync_map_storage immutable after init.


H12. DeviceScanner silently discards scan failures

File: server/SHServ/Tools/DeviceScanner.php Failed IPs are skipped without logging. There is no way to know which devices were expected but not found, or whether a device disappeared from the network.

Impact: Silent failures during device discovery, no audit trail for network changes.

Fix: Log scan results — found, lost, unreachable IPs.


H13. text-msgs.php has empty strings for SHServ errors

File: server/SHServ/text-msgs.php Lines 30–48: device_not_found, unknown_device, error_of_device_auth, device_request_fail, db_error, action_script_not_found, scope_not_found, etc. all have empty string values.

Impact: API returns empty msg fields for many errors, poor UX in clients.

Fix: Fill in meaningful Russian/English messages for all SHServ aliases.


H14. RequiredControlScriptsScope mixes business logic with event handling

File: server/SHServ/RequiredControlScriptsScope.php The online event handler directly mutates device entity state ($device -> device_ip = ...; $device -> update()). This is business logic that belongs in a service or model method, not in an event handler.

Impact: Business rules scattered across event handlers, hard to unit-test, duplication risk.

Fix: Move device online/offline state transition to Devices model or a dedicated service.


Medium (M)

M1. EventsHandlers devmode clutter

File: server/SHServ/EventsHandlers.php All devmode handlers are registered inside an if(FCONF["devmode"]) block at runtime. The code paths are still present and evaluated on every request.

Fix: Use conditional class loading or separate dev-only bootstrap file.


M2. console.php has only one command (get.config)

File: server/console.php The CLI entry point only supports dumping config (with credentials stripped). No commands for: run migrations, run cron manually, flush caches, health check, reset rate limiter, re-scan devices.

Fix: Add a command registry with useful ops commands.


M3. sync-map.json has no schema validation

File: server/ControlScripts/sync-map.json Loaded as raw JSON without validation. No checks that aliases correspond to real devices, that channel indices are within device schema, or that the graph is acyclic.

Impact: Invalid sync map causes silent failures (missing device → continue in helper) or wrong channel indices sent to devices.

Fix: Add a JSON Schema and a validation step at bootstrap (or at build time).


M4. ControlScriptsInterface lacks return type declarations

File: server/SHServ/Implements/ControlScriptsInterface.php Interface methods have no return types, while implementations use : void. This inconsistency weakens type safety.

Fix: Add void return type to interface methods.


M5. No observability metrics for automation layer

Project-wide: no counters for events dispatched, actions executed, cron scripts run/failed, device API call latency, event handler latency.

Fix: Add lightweight metrics (even in-memory counters exposed via a /metrics endpoint) or integrate with Prometheus/OpenTelemetry.


M6. add_event_handler discards handler return values

File: server/SHServ/Middleware/ControlScripts.php Method: add_event_handler The wrapper closure ignores the return value of $handler. There is no way for a handler to signal "stop propagation" or return a result.

Fix: Return the handler result, or support explicit stopPropagation() mechanism.


M7. SpotlightsScope::spotlights_by_time is an empty regular script

File: server/ControlScripts/Scopes/SpotlightsScope.php Line: 64–70 Registered but does nothing. Pollutes the regular scripts list and cron loop.

Fix: Remove or implement the time-based logic.


M8. TestScriptsScope references hardcoded production device

File: server/ControlScripts/Scopes/TestScriptsScope.php Line: 23–26 Alias test_stand_relay is hardcoded. If the device does not exist, the action script will fail at runtime.

Fix: Gate test scopes behind devmode flag, or move to a separate directory excluded from production scan.


M9. EventsModel naming inconsistency

File: server/SHServ/Models/EventsModel.php Methods: global_any_device_event_call, global_device_event_call, channel_device_event_call, alias_device_event_call, channel_alias_device_event_call. Naming pattern is inconsistent (_any_ vs _alias_ placement).

Fix: Standardize naming to {scope}_{target}_event_call.


M10. ControlScripts constructor uses fragile string parsing

File: server/SHServ/Middleware/ControlScripts.php Line: 24

list($scope_folder, $scope_name) = explode("\\", str_replace("\\Scopes", "", str_replace("SHServ", "", static::class)));

Assumes exactly two namespace segments. If namespace depth changes, parsing breaks.

Fix: Use basename(str_replace('\\', '/', static::class)) for name, and dirname() logic for folder.


M11. Scripts::remove_scope has TOCTOU race

File: server/SHServ/Models/Scripts.php Method: remove_scope

if(!file_exists($filepath)) { return false; }
if(!@unlink($filepath)) { return false; }

File can be modified between file_exists and unlink.

Fix: Just attempt unlink and check the result; suppress warning with @.


M12. No request correlation ID for event dispatch chain

When a device sends an event, it triggers multiple handlers, each making device API calls. There is no tracing ID to correlate the original event with downstream actions in logs.

Fix: Generate a correlation_id in EventsController and pass it through all event handlers and device API calls.


M13. Logging JSON file lacks indexing

File: server/Fury/Kernel/Logging.php Searching logs requires loading and parsing the entire JSON file. No grep-friendly line format.

Fix: Use newline-delimited JSON (NDJSON) for append-only, grep-friendly log files.


M14. ControlScripts constructor bypass for RequiredControlScriptsScope

File: server/SHServ/App.php RequiredControlScriptsScope is instantiated directly before scanning the Scopes/ directory. It is not listed in control_scripts_instances (it is in $required_control_scripts_instance). But its scripts are still registered in the same static arrays, mixing required and optional scopes.

Impact: Required scope handlers can be accidentally disabled if static arrays are cleared.

Fix: Separate static storage for required vs optional scopes, or ensure required scope is always registered regardless of DB state.


M15. error_handler in config only covers warnings and fatal errors

File: server/SHServ/config.php

"important_errors" => ["E_WARNING", "E_ERROR", "E_CORE_ERROR", "EXCEPTION"]

Notices, deprecations, and parse errors are silently ignored in production. This can hide bugs.

Fix: Include E_NOTICE, E_DEPRECATED, E_PARSE in dev; log (but don't display) them in production.


Summary

Severity Count
Critical 6
High 14
Medium 15

Key themes:

  1. Static state leakageControlScripts static arrays cause cross-request and cross-test contamination.
  2. Missing error boundaries — Cron scripts, action scripts, and event handlers run without try/catch, causing cascading failures.
  3. Unsafe file I/O — Scope updates are non-atomic; sync-map is re-read redundantly.
  4. No observability — No metrics, no tracing, no structured logging, empty error messages.
  5. Infrastructure gaps — No DB migrations, minimal CLI, no log rotation, .env fallbacks are insecure.