Open Questions

Stand: 2026-06-23

A running log of gaps, missing sources, unresolved tensions, and articles that still need to be written. Each entry is dated; resolved entries stay here for traceability (move them to the bottom under “Resolved”).

Open

  • 2026-06-24 — etlnext bridge verified end-to-end on a 3-slug smoke batch, then re-verified on a refreshed batch after wiki self-containment refactor. Smoked with plans/smoke-batch.csv. First run: mouvement-associatif, sfaf-movilizacion, kampajobs (job 8569230f…, workflow fb25d3b3…). Second run after self-containment refactor: ciudadania-inteligente, ipie-genai-campaigns, lma-idf (job b59d8fba…). Both 3/3 success via the etlnext Playwright path; RAW/<slug>.md written; manifest status=ok, scrape_method=etlnext-playwright. Three deviations from the original plan, all baked into the live state and now into the wiki:

    1. Auth. The ngojobs container runs with WORKFLOW_ENGINE_DEV_AUTH=1 and NODE_ENV=development. Host port :3001 is occupied by Docker itself, so the container maps -p 3002:3002 (the ngojobs app host port is unaffected).
    2. Missing plugins in the production image. engine/plugins/web_scraper.py, directory_scraper.py, cms_detector.py, json_writer.py are vendored into the wiki at scripts_build/etlnext_plugins/. After any docker rm ngojobs, run scripts_build/etlnext_bootstrap.sh — it docker cp’s the four files in, verifies the imports, and re-registers the workflow. Idempotent; safe to re-run.
    3. Abstract-method contract drift. The container’s base.py requires execute(); JsonWriter (older run()-style API) can’t be instantiated. Resolved by dropping the persist step from the wiki workflow — the wiki reads per-URL results from the run record (step_results[*].output) instead of from a file. Workflow YAML lives at scripts_build/wiki_source_harvest.yaml inside the wiki; etlnext_client.ensure_workflow() POSTs it on first use.
  • 2026-06-24 — Prompt 16 (AskWiki) verified end-to-end with Part F tests passing. Retriever (docker/ask/retriever.py, BM25 stdlib, 1,037 chunks indexed), streaming SSE ask_server (docker/ask/ask_server.py), widget (docker/ask/static/askwiki.{js,css}), post-build injector (quartz/post-build-inject-askwiki.js), nginx proxy with proxy_buffering off, Docker/entrypoint/compose all in place. After setting a valid OPENAI_API_KEY and restarting the container, full Part-F verification passed:

    • GET /api/ask/health{"status":"ok","chunks":1037,"model":"gpt-4o-mini"}
    • On-topic: curl -N -X POST http://localhost:8080/api/ask -d '{"question":"What is power mapping?"}'starttool_calltool_result (7 retrieved slugs: veneklasen-miller, steven-lukes, john-gaventa, saul-alinsky, resource-mapping, piven-cloward, labour-and-union-campaigning) → streamed token events → final with grounded 3-paragraph answer citing 1 2 3 4 (Lukes’ three faces of power, the three-dimensions framework, the power cube)
    • Off-topic: curl -N -X POST http://localhost:8080/api/ask -d '{"question":"What is the capital of Mars?"}'tool_result: []token: "The wiki doesn't cover this yet."final: {answer: "The wiki doesn't cover this yet.", sources: []}. Anti-hallucination gate works.
    • The off-topic miss is also appended to /audits/logs/ask_misses.log for editorial feedback.
  • 2026-06-24 — Prompt 16 LLM deviation: OpenAI gpt-4o-mini, not Claude Sonnet 4.6. The prompt specifies Claude via the anthropic SDK; the shipped code uses the openai SDK and reads OPENAI_API_KEY + ASK_MODEL=gpt-4o-mini. To switch to Claude: replace the client.chat.completions.create(stream=True) block in docker/ask/ask_server.py with the anthropic SDK’s client.messages.stream(model=…, max_tokens=1024, system=…, messages=…); swap OPENAI_API_KEYANTHROPIC_API_KEY, ASK_MODEL=claude-sonnet-4-6 in docker/docker-compose.yml; add anthropic to the Dockerfile’s pip install list. ~30 lines diff; widget and retriever unchanged.

  • 2026-06-24 — Prompt 16 no-data gate is coverage-based, not score-based. The spec says “top score below a floor”; the implementation uses coverage >= 0.8 (fraction of unique query terms that appear in the top chunk) — strictly stricter and works better on the small corpus, but a deviation.

  • **2026-06-24 — Prompt 16 sidecar port bindings: host 8082 is held by the Docker daemon itself (us-cli); curl to host:8082 from outside Docker will reset. The AskWiki sidecar binds 127.0.0.1:8082 inside the container and is reached via the nginx /api/ proxy on the host — i.e. http://localhost:8080/api/ask is the user-facing entry point, not :8082.

  • **2026-06-24 — Prompt 24 (Feedback form sidecar) referenced in entrypoint + compose but the /feedback/ directory was missing from the live image (verified: docker exec docker-wiki-1 ls /feedback/ → “No such file or directory”). A docker compose up --build would resolve it. Prompt 24 was not part of this session’s run; separate piece of work.

  • 2026-06-24 — etlnext wiki_source_harvest no longer writes to disk. The persist step was removed (see 2026-06-24 entry above); the wiki reads per-URL results from the run’s step_results[*].output via etlnext_client.fetch_outputs(). This sidesteps both the abstract-method contract drift and the per-job path-templating gap. If etlnext ever ships per-job templating and a real JsonWriter (matching the new execute() contract), the file-output step can be re-added. The wiki-side YAML is at scripts_build/wiki_source_harvest.yaml; the plugin sources are vendored at scripts_build/etlnext_plugins/.

  • 2026-06-23 — CMS / language detection deferred. The wiki_source_harvest workflow intentionally drops the cms_detector step (it writes to MongoDB, awkward for the pull-based flow). The wiki can run cms_detector separately later if needed; for v1 the per-RAW host: and final_url: fields are enough to identify site language and stack.

  • 2026-06-23 — Prompt 20 — strategy-chart ships as stub. Re-fetch a Midwest Academy chapter, the Blueprint for Change handbook, or a Commons Library page that covers the five-column strategy chart before promoting to emerging.

  • 2026-06-23 — Prompt 20 — okrs ships as stub. Re-fetch a Bond / NDI / BetterEvaluation / Wikimedia Foundation programme document that covers OKRs before promoting.

  • 2026-06-23 — Prompt 20 — logframe ships as stub. Re-fetch a Bond / ODI / UNDP / USAID logframe guide. The “means of verification” column is a frequent omission; BetterEvaluation is the most likely home.

  • 2026-06-23 — Prompt 20 — work-breakdown-structure ships as stub. Re-fetch a project-management chapter or a Seedstone / NDI source that covers WBS specifically.

  • 2026-06-23 — Prompt 20 — raci ships as stub. Re-fetch a deeper Commons Library organising chapter or a Seeds for Change / MobLab project-management guide that covers RACI.

  • 2026-06-23 — Prompt 20 — contribution-analysis ships as stub. Re-fetch BetterEvaluation’s contribution-analysis page or a Bond / ODI guide.

  • 2026-06-23 — Prompt 20 — outcome-harvesting-msc ships as stub. Re-fetch BetterEvaluation’s OH and MSC pages or a Bond / ODI guide.

  • 2026-06-23 — Prompt 20 — resource-mapping ships as stub. Re-fetch a deeper Commons Library organising chapter or a NDI / WRI resource-mapping guide.

  • 2026-06-23 — Prompt 20 — pestle-scan ships as stub. No corpus source mentions PESTLE by name. Re-fetch a deeper Community Tool Box chapter or pull a civic-space monitor (CIVICUS, ICNL, ECNL) that covers the PESTLE/PESTEL framework.

  • 2026-06-23 — Prompt 20 — Country-specific L-lens content (protest law, charity / political-activity rules, GDPR implications, US 501(c)(3) limits) is the gap the wiki flagged and remains unsourced. Likely homes: CIVICUS Monitor, ICNL Civic Space Initiative, ECNL, country-specific compliance guides.

  • 2026-06-23 — Prompt 20 — Further widening backlog (named in the reference module as the next page-creation targets, not part of this run):

    • Message framing & narrative strategy as its own method page (currently split across framing-and-narrative and public-narrative).
    • Audience segmentation & message testing (the audience-segmentation page covers segmentation but message testing is open).
    • Coalition-building agreements: MOUs, decision rights, shared branding (open beyond coalition-building).
    • Escalation ladders / tactical sequencing (open beyond escalation and the-tactic-star).
    • Digital metrics & funnel (list growth, conversion, supporter journey).
    • Crisis communications & rapid response.
    • Data protection & consent for supporter data (the legal/compliance gap the wiki flagged).
    • Wellbeing, sustainable pace & burnout prevention for organisers.
  • 2026-06-23 — Prompt 17 — Unit-economics indicators (cost per volunteer / contact / policymaker meeting) for [[budget-and-controlling]] are widely recommended by advocacy planning guides but are not stated verbatim in the locally fetched RAW for any cited source. Re-fetch the Community Tool Box action-planning chapters or add MobLab’s Campaign Accelerator content to substantiate them before promoting the page to established.

  • 2026-06-23 — Prompt 17 — The four-column risk register template in risk-management is the standard practitioner pattern but the specific phrasing is not verbatim in the locally fetched RAW. Re-fetch a deeper Community Tool Box / Commons Library chapter on risk and scenario planning, or pull a Seeds for Change / MobLab risk-management guide, before promoting the page to emerging or established.

  • 2026-06-23 — Prompt 17 — No corpus source currently covers scenario trees, decision rules, or trigger-event methodologies for civil-society campaigns. Likely homes: WRI handbook, SMK campaign-training, or MobLab Campaign Accelerator deep-dive.

  • 2026-06-23 — Prompt 17 — The locally fetched RAW for mcalevey-keine-halben-sachen is only the Rosa-Luxemburg-Stiftung homepage and does not contain substantive content on Jane McAlevey. Re-fetch the book’s chapter contents or pull a Commons Library / Seeds for Change page that directly discusses her framework before promoting thinkers/jane-mcalevey or its key related page [[structure-tests]].

  • 2026-06-23 — Prompt 17 — No corpus source currently fetched directly attributes content to any of the following thinkers (page is a stub awaiting a covering source):

    • thinkers/robert-helvey, thinkers/erica-chenoweth, thinkers/maria-stephan, thinkers/srdja-popovic, thinkers/hahrie-han, thinkers/mark-paul-engler, thinkers/piven-cloward, thinkers/charles-tilly, thinkers/sidney-tarrow, thinkers/doug-mcadam, thinkers/zeynep-tufekci, thinkers/george-lakoff, thinkers/anat-shenker-osorio, thinkers/drew-westen, thinkers/robert-cialdini, thinkers/edward-bernays, thinkers/steven-lukes, thinkers/john-gaventa, thinkers/veneklasen-miller, thinkers/antonio-gramsci, thinkers/hannah-arendt, thinkers/paulo-freire, thinkers/laclau-mouffe, thinkers/myles-horton, thinkers/ella-baker, thinkers/augusto-boal, thinkers/fred-ross, thinkers/ed-chambers, thinkers/sasha-issenberg, thinkers/donald-green-alan-gerber, thinkers/heimans-timms, thinkers/richard-viguerie, thinkers/paul-weyrich, thinkers/morton-blackwell, thinkers/grover-norquist.
    • Likely homes for each are listed in the per-page ## Open Questions block.
  • 2026-06-23 — Prompt 17 — risk-management is shipped as stub/grounding: unverified. Promote after a covering source is fetched.

  • 2026-06-23 — Prompt 17 — [[governance]] cites the SMK Navigating charity campaigning “five essential conversations” but the full text is gated. Re-fetch the SMK PDF before promoting the page to established.

  • 2026-06-23 — WRI Handbook and ICNC resource-library licenses. Verify before quoting. Treated as link-only by default; confirm per resource.

  • 2026-06-23 — Re-fetch thin / blocked / failed sources. _scrape_manifest.json currently only records 3 entries — most sources were scraped but the manifest is stale. Run scrape_sources.py --force to rebuild before publishing any quotations derived from link-only RAW.

  • 2026-06-23 — CANVAS / Canvasopedia current URL. Confirm canvasopedia.org resolves and matches the same curriculum as the legacy CANVAS site.

  • 2026-06-23 — Beautiful Trouble translation inventory. Confirm which modules exist in DE/FR/ES translations at current URL paths.

  • 2026-06-23 — Beautiful Rising current CC version. Verify BY vs BY-SA vs BY-NC before quoting.

  • 2026-06-23 — Manual Práctico de Incidencia Política (UNESCO mirror). Hosting on healtheducationresources.unesco.org may move; verify before publishing.

  • 2026-06-23 — Country-specific legal/compliance gaps. No pages yet on protest law, data protection, or charity-political-activity rules for any language region.

  • 2026-06-23 — Electoral/party-side balance. The wiki currently skews movement-organising. Need more electoral/campaign-finance and party-side sources per language region.

  • 2026-06-23 — Concept pages still at stub/emerging. Most concept pages promoted to emerging in Prompt 6; a handful of source pages still stub because their licence blocks substantive distillation. Future passes can promote them when RAW content expands.

Resolved

  • 2026-06-23 — FOCO e.V. / DICO current domains. Confirmed via handbuch-community-organizing stub; flagged for verification before publishing the related-site links. (Partial — still pending verification.)
  • 2026-06-23 — Concept article “leadership-development” broken wikilink. Created Wiki/leadership-development.md and added it to INDEX.
  • 2026-06-23 — Compile pass green (Prompt 6). Health check reports 0 broken wikilinks / 0 orphans / 0 frontmatter issues / 0 INDEX drift on 202 pages.

URL Verification — 2026-06-23

Run python3 scripts_build/verify_urls.py to refresh. The remaining problematic URLs (kept as link-only stubs; flagged here for verification or replacement on next pass):

URL Verification Report

Stand: 2026-06-23 Checked 168 source pages; 157 OK, 11 dead/timeout/no_url.

Dead or timing-out URLs

Curation Gap Report — 2026-06-23

Generated by scripts_build/score_sources.py + scripts_build/build_by_tier.py. See scoring/summary.md for the full table.

Tier counts

  • Core (high relevance + high validity): 55
  • Verify-then-use (high relevance, mid/low validity): 58
  • Reference (high validity, low/mid relevance): 16
  • Deprioritize (low relevance + low validity): 39

Corpus coverage

  • All four target languages (EN, DE, FR, ES) and “Multi” have at least one Core source.
  • All eight taxonomy categories have at least one Core source.

Orientation

  • 42 nonpartisan, 33 progressive, 2 conservative, 2 mixed, 89 unknown.
  • Orientation never affected any score; recorded as metadata only.

The following sources are flagged at_risk or dead_link? and need re-fetch via python3 scripts_build/scrape_sources.py --fallback --only <slug>:

  • actipedia, artful-activism, beautiful-rising, beautiful-trouble-toolbox
  • cairn-info, canvas, ciudadania-inteligente
  • civil-resistance-2-0, civil-resistance-tactics-21c
  • elpais-comparador, european-greens-strategy, flacso
  • global-nonviolent-db, incidencia-colombia, incidencia-unaids
  • infoelectoral, ipie-genai-campaigns, kampajobs, lma-idf
  • manual-campanas-ongawa, mouvement-associatif, navco
  • ndi-campaign-skills, sfaf-movilizacion, waging-nonviolence
  • wikipedia-campagne-electorale, wikipedia-campana-politica, wikipedia-kampagne, wikipedia-political-campaign
  • wri-handbook
  1. Re-run scrape_sources.py --fallback --only <slug> for each at-risk slug.
  2. For 8 Deprioritize sources with stale URLs, find replacements or archive.
  3. The 39 Deprioritize entries are candidates for archival on next maintenance pass.

QC demotion log — 2026-06-23

Prompt 13c demoted 24 pages from established to emerging after qc_check.py found their grounding was secondary or unverified. The body content is preserved; the downgrade reflects that the page’s claims cannot be verified against the cited sources’ own text. Pages will be restored to established only after Prompt 14’s adversarial verification or after Prompt 13b’s re-fetch provides RAW-backed text for the listed sources.

PageGrounding
campaigns-vs-movementssecondary
methods-of-nonviolent-actionsecondary
incidencia-politicasecondary
coalition-buildingsecondary
civic-techsecondary
distributed-organizingsecondary
framing-and-narrativesecondary
constructive-programmeunverified
noncooperationsecondary
nonviolent-direct-actionsecondary
digital-securitysecondary
boycotts-and-strikessecondary
civil-resistancesecondary
buergerbegehrensecondary
theory-of-changesecondary
dilemma-actionssecondary
three-and-a-half-percent-rulesecondary
citizen-lobbyingsecondary
the-campaign-cyclesecondary
affinity-groupssecondary
public-narrativesecondary
the-tactic-starsecondary
commons-libraryunverified
beautiful-troubleunverified