GHCP — Crawl Track v2.3.1: Exercises 0–10 (bksw_format + bookshelf hygiene) by lbakerchef · Pull Request #4186 · chef/chef-server

lbakerchef · 2026-05-18T20:33:32Z

Summary

GHCP Crawl Track v2.3.1 — Exercises 0–10 on the chef/chef-server repository.

This PR documents an AI-assisted learning track using the real production codebase.
Exercises progress from scaffolding and orientation through refactoring, validation,
docs, perf measurement, dependency audit, security hygiene, structured logging,
and CI baseline work. The chosen practice module is src/bookshelf/src/bksw_format.erl
— a pure utility module with no runtime side-effects and no downstream dependencies
inside bookshelf.

Review Focus

These are the areas that warrant the most reviewer attention.

Area	File(s)	Why it needs eyes
Input validation guards	`src/bookshelf/src/bksw_format.erl`	`when is_binary(Bin)` guards added to `to_hex/1` and `to_base64/1`; verify guard placement doesn't break existing callers passing non-binary
Structured logging	`src/bookshelf/src/bksw_wm_sql_object.erl`	3 error log lines updated to include `req_id`; verify pattern match on `#context{reqid = ReqId}` is exhaustive
Security hygiene	`.gitignore`, `spec/fixtures/pivotal.pem`	New `.pem`/`.key`/`.env*` gitignore entries; PEM header prepended — confirm header doesn't break any PEM parser that reads the fixture
Test coverage	`src/bookshelf/test/bksw_format_tests.erl`	8 tests total (5 positive + 3 negative); confirm negative tests use `?assertException` pattern correctly
Local test script	`scripts/test-bookshelf-unit.sh`	Verify `ERL_LIBS` and `erlc -I include` paths are portable across different dev setups

Changes by Exercise

Ex	Title	Files changed	Type
0	Bootstrap scaffolding	`ai-track-docs/`, `.copilot-track/crawl/`	docs
1+2	Repo orientation + deterministic tests	`SYSTEM-OVERVIEW.md`, `build-test.md`, `bksw_format_tests.erl`	docs + tests
3	Readability refactor	`bksw_format.erl`	refactor
4	Docstrings + extending guide	`bksw_format.erl`, `extending-bksw_format.md`	docs + comments
5	Input validation + negative tests	`bksw_format.erl`, `bksw_format_tests.erl`	validation + tests
6	Performance baseline	`perf-baseline.md`	docs
7	Dependency audit	`dependencies.md`	docs
8	Security hygiene	`.gitignore`, `pivotal.pem`, `setup_helper.erl`, `security-hygiene.md`	hygiene
9	Structured logging	`bksw_wm_sql_object.erl`, `logging.md`	improvement
10	CI baseline + local script	`scripts/test-bookshelf-unit.sh`, `build-test.md`	tooling + docs

Risk & Rollback

Risk tiers

Risk	Items	Severity
Behaviour change	`bksw_format.erl` guards (`when is_binary`)	🟡 Low — existing callers in `bksw_xml.erl`, `bksw_req.erl`, `bksw_wm_sql_object.erl`, `bksw_wm_object.erl` all pass `binary()` values; no non-binary caller exists in the codebase
Log format change	`bksw_wm_sql_object.erl` 3 lines	🟡 Low — format change only; no logic change; operators parsing old log format for alerting rules should update regex
PEM file header	`spec/fixtures/pivotal.pem`	🟢 Minimal — text before `-----BEGIN` is valid PEM; test suite still passes
gitignore	`.gitignore` new patterns	🟢 Minimal — already-tracked `.pem` files remain tracked; only blocks future untracked files
Docs/scripts	All other files	🟢 None — additive only

Rollback

# Revert entire branch (safest — one command)
git revert --no-commit c289a28f3..HEAD && git commit -m "revert: crawl track exercises 0-10"

# Revert only the behaviour-change commits (surgical)
git revert c833ab19d  # ex5 — remove is_binary guards
git revert 062633d1d  # ex9 — revert log line changes

# Revert only the gitignore change
git revert 67ad2fed4  # ex8 — remove secret file patterns

Verification Steps

# 1. Run bksw_format unit tests (all 8 must pass)
bash scripts/test-bookshelf-unit.sh verbose

# Expected last line:
#   All 8 tests passed.

# 2. Confirm is_binary guards reject non-binary inputs
erl -noshell -pa /tmp -eval '
  catch bksw_format:to_hex("string"),
  catch bksw_format:to_hex(42),
  io:format("guards ok~n"),
  init:stop()
'

# 3. Confirm req_id appears in error log format
grep -n "req_id" src/bookshelf/src/bksw_wm_sql_object.erl

# 4. Confirm no real credentials in tracked files
git grep -n "BEGIN RSA PRIVATE KEY" -- "*.erl" "*.rb"
# Expected: only itest/ and spec/ paths

# 5. Confirm gitignore does not un-track existing .pem files
git ls-files src/chef-server-ctl/spec/fixtures/pivotal.pem
git ls-files src/chef-server-ctl/habitat/config/pivotal.pem
# Both must print the path (still tracked)

Evidence

All 8 unit tests passing

======================== EUnit ========================
module 'bksw_format_tests'
  to_hex_produces_lowercase_hex_string_test...[0.003 s] ok
  to_hex_empty_binary_produces_empty_string_test...ok
  to_hex_rejects_string_input_test...ok
  to_hex_rejects_integer_input_test...ok
  to_base64_encodes_binary_test...[0.014 s] ok
  to_base64_rejects_string_input_test...ok
  to_etag_wraps_hex_in_quotes_test...ok
  to_date_undefined_returns_epoch_test...ok
  [done in 0.041 s]
=======================================================
  All 8 tests passed.

Performance baseline (key numbers)

Function	Input	avg latency
`to_hex/1`	16 B	20 µs
`to_hex/1`	1 024 B	1 485 µs
`to_base64/1`	1 024 B	23 µs

No optimization action required.

Security audit summary

✅ No hardcoded production credentials
✅ Bookshelf uses chef_secrets:get/2 at runtime
✅ oc_chef_action redacts "password" field from audit logs
🟡 4 deps branch-tracked in rebar.lock (see ai-track-docs/dependencies.md)
🟡 Ruby 3.1 and Redis 5.x are EOL (noted, out of scope for this track)

Proposed Commit Message Improvements

Two commits have suboptimal messages. Proposed rewrites (for reference — not
amended in this PR since history is already pushed):

c289a28f3 — missing exercise prefix:

# Current
ghcp(crawl): ex0 bootstrap scaffolding

# Proposed
exercise 0 — ghcp(crawl): bootstrap scaffolding for Crawl track

Added ai-track-docs/ (SYSTEM-OVERVIEW, build-test, architecture.mmd)
and .copilot-track/crawl/README.md with chain-PR conventions and
evidence block template.

d83daa927 — says "exercise 2" but covers exercises 1 and 2:

# Current
exercise 2 — ghcp(crawl): ex1+ex2 repo orientation, low-risk module, deterministic tests

# Proposed
exercise 1+2 — ghcp(crawl): repo orientation, module selection, deterministic tests

- Ex1: service map, language table, entry points, test approach in SYSTEM-OVERVIEW.md
- Ex2: verified bookshelf build commands, bksw_format chosen as low-risk module,
  bksw_format_tests.erl with 5 deterministic EUnit tests (all passing)

Future exercises: the exercise N — ghcp(crawl): <verb> <what> pattern is
working well; keep subject line under 72 chars and use the body for why.

Track

Level: Crawl
Version: v2.3.1
Exercises completed: 0–10 of 15
Branch: learn/crawl/lbaker-ex0-bootstrap
Practice module: src/bookshelf/src/bksw_format.erl

netlify · 2026-05-18T20:33:40Z

👷 Deploy Preview for chef-server processing.

Name	Link
🔨 Latest commit	`a6bc5d6`
🔍 Latest deploy log	https://app.netlify.com/projects/chef-server/deploys/6a0b940078ef25000887411f

…deterministic tests - Updated SYSTEM-OVERVIEW.md with language/framework summary, entry points, test approach, and chosen low-risk module (bksw_format.erl) with justification - Updated build-test.md with exact bookshelf build/test commands, verified test output, and two new troubleshooting entries (GCC segfault, rebar3 path) - Added src/bookshelf/test/bksw_format_tests.erl with 5 deterministic EUnit tests covering all 4 exported functions in bksw_format (to_hex, to_base64, to_etag, to_date); all 5 tests pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Extract byte_to_hex/1 private helper from to_hex/1 list comprehension; intent is now readable at a glance without parsing nested format calls - Remove no-op string:to_lower/1 call (2.16.0b format already produces lowercase) - Add -spec attributes for all 4 public functions and byte_to_hex/1 - All 5 existing EUnit tests pass; behavior unchanged Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@doc

- Add @doc comments to all functions in bksw_format.erl explaining purpose, input shapes, edge cases, and examples - Mark byte_to_hex/1 as @Private - Create ai-track-docs/extending-bksw_format.md with guidance on when/how to add new functions, current exports reference, a worked example, and module health guidelines - All 5 EUnit tests pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… for bksw_format - Added is_binary/1 guards to to_hex/1 and to_base64/1 - Three new negative tests: rejects string and integer inputs - All 8 tests passing (5 positive + 3 negative) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- timer:tc escript benchmark, 10k iterations per cell - to_hex/1: 17–1485 µs avg across 16/128/1024 byte inputs (linear O(n)) - to_base64/1: 1–23 µs avg — stdlib base64 NIF, ~64x faster - Variance notes: WSL2 scheduler jitter, JIT first-call spike documented - No optimization recommended; no hotspot identified Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…kshelf - 4 deps branch-tracked in rebar.lock (not reproducibly pinned): erlcloud, mini_s3, erlsom, sqerl — all on personal/feature branches - 10 deps branch-tracked in config but ref-frozen in lock (safe today) - Only iso8601 (tag) and meck (ref) are properly pinned in rebar.config - OTP version mismatch: rebar.config requires 26.2.5.15, omnibus override has 26.2.5.14 commented out - Platform debt noted: Ruby 3.1 EOL, Redis 5.x EOL, omnibus-ctl on main - No files modified; audit-only Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Audit (security-hygiene.md): - No hardcoded production credentials found in src/ - Bookshelf uses chef_secrets:get/2 for runtime secret loading (correct) - oc_chef_action properly redacts password field from audit logs (correct) - Habitat .pem files are template placeholders, not real keys (confirmed safe) - AWS key in bksw_sec_tests is AKIAIOSFODNN7EXAMPLE docs placeholder (safe) Remediations: - .gitignore: added *.pem, *.key, .env*, *.secret, *.credentials and SSH key name patterns as safety net for future untracked secret files - spec/fixtures/pivotal.pem: prepended test-only header (valid PEM format) to prevent false positives in secret scanners - itest/setup_helper.erl: added TEST FIXTURE comment on zuperzecret literal Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…sql_object Three lager:error calls lacked req_id trace context despite #context{} being in scope. These are integrity-critical events (checksum/MD5 mismatches, upload failures) that operators need to correlate to specific requests. Changes: - send_streamed_body/1: add reqid=ReqId to pattern; log as key=value fields - write_streamed_body/3: extract Ctx1#context.reqid into content-md5 mismatch log - maybe_upload/3: promote error_logger to lager:error; add req_id to pattern; use key=value field format All three now emit: req_id=<id> <event>: field=value ... Docs: ai-track-docs/logging.md — logging stack, req_id convention, before/after, best practices table Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ation CI finding: GitHub Actions runs security scans (trufflehog, trivy, grype, BlackDuck, SonarQube) but build=false and unit-tests=false — Erlang unit tests, dialyzer, and elvis are NOT executed by CI. Local script: scripts/test-bookshelf-unit.sh - Bypasses GCC segfault on jiffy C NIF compilation (pre-existing system bug) - Uses ERL_LIBS + erlc to compile bksw_format + bksw_format_tests directly - Runs eunit:test via erl -noshell with proper exit code propagation - Cleans up temp files on exit; accepts 'verbose' argument - All 8 tests pass: exit 0 Updated ai-track-docs/build-test.md: - CI section rewritten with what runs vs what does not - Local script usage documented - Full local quality gate checklist Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Updated PR title and body with: - Review focus table: 5 items with rationale (guards, logging, PEM, gitignore, script) - Risk tiers: behaviour-change (low), log format (low), gitignore/docs (minimal) - Rollback: full-branch revert + surgical per-commit options - Verification: 5 runnable shell checks with expected output - Evidence: EUnit output, perf table, security audit summary - Commit message improvements proposed for c289a28 and d83daa9 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…ings BACKLOG-001 (High/M): Pin 4 branch-tracked rebar.lock deps to SHA refs — erlcloud, mini_s3, erlsom, sqerl all on personal branches in lock file BACKLOG-002 (High/S-M): Enable Erlang unit tests in GitHub Actions CI — build: false / unit-tests: false; no automated test gate on PRs BACKLOG-003 (Med/S): Propagate req_id to 14 remaining untraced error logs — bksw_wm_sql_object (10) + bksw_wm_object (3) + bksw_io (1) BACKLOG-004 (Med/S): Resolve OTP version mismatch omnibus vs rebar.config — rebar.config requires 26.2.5.15; commented omnibus override says .14 BACKLOG-005 (Med/S): Add trufflehog/gitleaks allowlist for test PEM fixtures — 5 tracked .pem files will trip scanner; no machine-readable allowlist Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…o_hex/1 Adds a runtime-configurable toggle for hex letter case via the bookshelf application environment: application:set_env(bookshelf, hex_encoding_case, uppercase) % -> "0AFF" application:set_env(bookshelf, hex_encoding_case, lowercase) % -> "0aff" (default) Design: - application:get_env/3 called once per to_hex/1 (not per byte) - string:to_upper/1 only invoked on uppercase path (zero cost default) - Any unknown value falls through to lowercase (fail-safe) - No restart required; takes effect on next call Tests (2 new, 10 total — all passing): - to_hex_uppercase_toggle_produces_uppercase_test: set env, assert, always unset - to_hex_default_is_lowercase_when_toggle_unset_test: confirm default Docs: ai-track-docs/feature-toggles.md — config, risks, ETag consistency warning Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Elvis: 0 violations across all 20 bookshelf src/*.erl files (9 rules). All Ex3-13 changes to bksw_format.erl and bksw_wm_sql_object.erl pass clean. Dialyzer on bksw_format.erl (with iso8601 dep): exit 0, no warnings. All -spec attrs consistent with actual return types including the Ex13 string:to_upper/1 toggle path. bksw_wm_sql_object.erl: 'Unknown functions' warnings are expected (sibling modules not in minimal PLT) — no defects in our Ex9 code. Pre-existing finding: mochiweb_http.erl:79 unreachable pattern in 3rd-party mochiweb v2.x; not introduced by our changes. Gaps documented: no Elvis/Dialyzer in CI, full PLT blocked by jiffy NIF segfault. See ai-track-docs/static-analysis.md for full findings and run instructions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Four new tests (14 total, all passing): 1. to_hex_unknown_toggle_value_falls_back_to_lowercase_test Proves the wildcard clause in to_hex/1's case expression fires on unrecognised hex_encoding_case values — bad config never crashes. 2. to_etag_empty_binary_produces_empty_quoted_string_test Edge case: to_etag(<<>>) produces "" (valid ETag for empty object). 3. to_hex_large_binary_does_not_crash_test Boundary: 1 KiB binary (1024 bytes) encodes to exactly 2048 hex chars with correct content — no crash, no truncation. 4. to_date_datetime_tag_unwraps_and_formats_test Previously untested code path: the {datetime, Date} tagged form of to_date/1 was missing coverage. Verifies the unwrap clause delegates to iso8601 and returns a binary ISO 8601 string containing the date. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ghcp(crawl): ex0 bootstrap scaffolding

c289a28

lbakerchef requested review from a team as code owners May 18, 2026 20:33

lbakerchef and others added 9 commits May 18, 2026 16:07

lbakerchef changed the title ~~GHCP — Crawl: Ex0 Bootstrap~~ GHCP — Crawl Track v2.3.1: Exercises 0–10 (bksw_format + bookshelf hygiene) May 18, 2026

lbakerchef and others added 5 commits May 18, 2026 17:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GHCP — Crawl Track v2.3.1: Exercises 0–10 (bksw_format + bookshelf hygiene)#4186

GHCP — Crawl Track v2.3.1: Exercises 0–10 (bksw_format + bookshelf hygiene)#4186
lbakerchef wants to merge 15 commits into
mainfrom
learn/crawl/lbaker-ex0-bootstrap

lbakerchef commented May 18, 2026 •

edited

Loading

Uh oh!

netlify Bot commented May 18, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lbakerchef commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Review Focus

Changes by Exercise

Risk & Rollback

Risk tiers

Rollback

Verification Steps

Evidence

All 8 unit tests passing

Performance baseline (key numbers)

Security audit summary

Proposed Commit Message Improvements

Track

Uh oh!

netlify Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

👷 Deploy Preview for chef-server processing.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lbakerchef commented May 18, 2026 •

edited

Loading

netlify Bot commented May 18, 2026 •

edited

Loading