APPENDIX C

Evidence Mapping

Palace Research Report · HCDE Capstone · Winter 2026

This document maps each design requirement to its supporting evidence across primary research (user interviews), secondary research (literature), and competitive analysis.

Fig. 1. Design Requirements Traceability: traceability from design principles to requirements and features

Design Requirement 1: Support Recognition Over Recall

Requirement: The solution should support vague recall, emphasizing visual and relational cues over precision search, enabling users to recognize what they are looking for rather than formulate exact queries.

Primary Research Evidence (User Interviews)

Participant	Quote	Source
P1	“It requires you to remember precisely the keywords that are relevant and specific ones. Like, if you remember something too broad, then it’s going to bring up 30 things for you to search through, and then that’s not very helpful.”	Winter Report, lines 248-249
P2	“Something like Obsidian’s neural map—automatic linking of related content. Instead of manually remembering connections, it reminds me.”	Interview Notes, p. 25
P2	“As a map or threaded interface showing how new information connects to previous saved content—whether it expands or contradicts it.”	Interview Notes, p. 25
P3	“What’s most frustrating—not knowing where it is, or not knowing how to query it? A bit of both, but mostly not knowing where it is.”	Interview Notes, p. 35
P4	“Sometimes I vaguely remember something from a research paper and try to find it using vague keywords on Google Scholar. That happens a lot.”	Interview Notes, p. 50
P4	“Mostly through figures—the more innovative visualization, the better. Visuals are most impactful.”	Interview Notes, p. 51

Pattern: All 6 participants reported difficulty with precision-based search when recall is vague (per Winter Report, Finding 4). Quotes from 4 of 6 are included above; P5 and P6 transcripts are unavailable for direct citation.

Secondary Research Evidence

Finding	Source
Vocabulary gaps between how users remember information and how it was originally stored reduce keyword search effectiveness.	Secondary Research, Section 3 (specific statistic unverified—no primary source identified)
“Semantic search delivers relevant answers to even vague or unconventional queries.”	Secondary Research, Section 3
Belkin’s ASK (Anomalous State of Knowledge): “The request is an incomplete, distorted expression of the underlying need.”	Agentic IR Conference Tutorial (CHIIR 2026)
“Users do not specify their needs in fleshed outcome.”	Agentic IR Conference Tutorial

Competitive Analysis Evidence

Tool	Evidence
Heptabase/Scrintal	Success of visual PKM tools validates spatial/visual modality for retrieval
Recall AI	Built entire product around semantic search for vague queries—validates problem space
Obsidian Graph View	User (P2) specifically requested “something like Obsidian’s neural map”

Evidence Strength: STRONG

6/6 interview participants experienced this problem (per Winter Report; 4 directly quoted above)
Vocabulary gap between recall and storage is well-established in IR literature
Multiple successful products built around this insight

Design Requirement 2: Preserve Persistent Project Context

Requirement: The solution should serve as the default entry surface at the start of a work session, stabilizing context and supporting ongoing reasoning. Each project should maintain its own contextual field where materials, notes, claims, and artifacts coexist within that boundary.

Primary Research Evidence (User Interviews)

Participant	Quote	Source
P1	“What starts out working—as you gain more and more information that you’re trying to store—I usually don’t adapt my system appropriately.”	Winter Report, Finding 2
P2	“Obsidian was fragile for me. Tagging was inconsistent—‘career’ vs ‘career development’ became separate tags. It got messy, so I abandoned it.”	Interview Notes, p. 26
P3	“I’m running out of storage in my primary account I’ve used for 10+ years. I switched to another Google account… Now it’s an extra task to remember which drive my docs are in.”	Interview Notes, p. 36
P3	“Sometimes my files or Excel sheets are structured in a way where I can’t remember why I set it up that way… It takes a minute—five or ten minutes—to remember.”	Winter Report, Finding 2
P4	“The failure is the gap between literature review time and writing time. That gap can be 4–5 months, 10 months, even a year. The larger the gap, the harder it is to remember.”	Interview Notes, p. 50

Pattern: Organizational systems decay over time. Context drifts. Project boundaries blur.

Secondary Research Evidence

Finding	Source
OpenAI reports that usage of Custom GPTs and Projects increased 19x year-to-date, with about 20% of Enterprise messages processed via a Custom GPT or Project.	OpenAI Enterprise Report 2025
Microsoft “Frontier Firm” model: Teams form around goals, not functions—like movie production with tailored teams assembling for projects.	Microsoft 2025 Work Trend Index
“Context engineering is becoming a critical tool for unlocking the value of AI.”	QCon London 2026 (speaker/talk title unrecorded)
A-MEM (NeurIPS 2025): Proposes Zettelkasten-inspired approach where “new memories trigger updates to existing representations.”	Secondary Research, Section 2

Competitive Analysis Evidence

Tool	Evidence
Claude Code	CLAUDE.md files persist project context; auto-memory saves learnings across sessions
Windsurf	“Memories” feature that persists across sessions—ranked #1 in AI dev tools
NotebookLM/Claude Projects	Project-scoped AI tools; the 19x growth OpenAI reports for Custom GPTs and Projects suggests demand for this category
All AI containers	Structural limitation: require manual upload, don’t integrate across tools

Evidence Strength: STRONG

Organizational decay observed across quoted participants (P1-P4); P5 and P6 not directly quoted
19x year-to-date growth in Custom GPTs and Projects usage (OpenAI Enterprise Report 2025)
Leading AI tools (Claude Code, Windsurf) building memory/context features

Design Requirement 3: Maintain Visible Source Traceability

Requirement: Treat “ideas/claims/insights” as the primary unit of reasoning. Every idea should remain visibly connected to its originating sources. Traceability should be preserved from raw material → extracted idea → synthesis artifact.
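
The chain described in this requirement (raw material → extracted idea → synthesis artifact) can be sketched as a small data model. This is a minimal illustration only, not the project's actual schema: the class names `Source`, `Idea`, and `Artifact`, their fields, and the `provenance` method are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    """Raw material, such as a paper, transcript, or web page (hypothetical type)."""
    source_id: str
    title: str


@dataclass
class Idea:
    """An extracted claim or insight that must point back to its sources."""
    text: str
    sources: list[Source]

    def __post_init__(self) -> None:
        # Enforce traceability: an idea without a source is rejected.
        if not self.sources:
            raise ValueError("an idea must reference at least one source")


@dataclass
class Artifact:
    """A synthesis artifact (slides, document) assembled from ideas."""
    title: str
    ideas: list[Idea] = field(default_factory=list)

    def provenance(self) -> dict[str, list[str]]:
        """Map each idea's text to the titles of its originating sources."""
        return {idea.text: [s.title for s in idea.sources] for idea in self.ideas}


# Usage with placeholder data (not taken from the study).
paper = Source("s1", "Example paper")
idea = Idea("Vague recall defeats keyword search", [paper])
deck = Artifact("Example synthesis deck", [idea])
print(deck.provenance())
```

Rejecting source-less ideas at construction time is one possible way to keep the chain unbroken; a real design might instead flag such ideas for review.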

Primary Research Evidence (User Interviews)

Participant	Quote	Source
P4	“I wouldn’t say PowerPoint is the source of truth—the code is—but PowerPoint filters the most important figures/plots for people who don’t want technical details.”	Winter Report, Finding 5
P4	“It becomes hard to cite work we used to base our research. If we don’t cite, it’s ethically wrong. And it’s frustrating to find the paper with only vague keywords—sometimes you find it, sometimes you don’t.”	Interview Notes, p. 50
P4	“Ideally I ask AI with the paper, or an AI that has all research papers: it gives the answer and points to the paper so I can cite it.”	Interview Notes, p. 51
P3	“If it’s AI-driven—hallucinations. If it gives me synthesized material not traceable to sources, that’s a concern.”	Interview Notes, p. 37
Research observation	“Synthesis is manually reconstructed in Slides, becoming the temporary source of truth. Source traceability becomes fragmented across tools.”	Winter Report

Pattern: Synthesis artifacts (slides, docs) become disconnected from sources. Users need to reverse-engineer connections.

Secondary Research Evidence

Finding	Source
A-MEM: Creates interconnected knowledge networks through dynamic indexing and linking, generating notes with contextual descriptions, keywords, and tags.	Xu, W., et al. (2025). A-MEM: Agentic Memory for LLM Agents. arXiv:2502.12110
Faithfulness metrics for agentic evaluation: “evidence support,” “source authority score,” “source freshness,” “viewpoint diversity.”	Agentic IR Conference Tutorial
Microsoft CHI 2025: AI should be positioned as “thought partner” and “provocateur” rather than an answer-delivery system.	Tankelevitch, L., et al. (2025). Tools for Thought. CHI 2025. Microsoft Research.

Competitive Analysis Evidence

Tool	Gap
NotebookLM	Partial traceability—can cite uploaded sources but not synthesis chain
Claude Projects	Partial—cites uploaded docs but synthesis reasoning not visible
Heptabase	Strong traceability via bidirectional links, but requires full migration
All AI containers	Don’t maintain provenance from raw material → idea → synthesis

Evidence Strength: MODERATE-STRONG

2 participants explicitly mentioned traceability as critical (P3, P4)
P4’s citation problem is acute: P4 described failing to cite as “ethically wrong”
Agentic evaluation metrics validate this as emerging standard
Gap exists in all AI container tools reviewed; Heptabase offers strong traceability but requires full migration

Fig. 2. Cognitive Load Shift: how AI shifts cognitive load from mechanical tasks to judgment

Design Requirement 4: Reduce Context Switching

Requirement: The solution should minimize the cognitive cost of retrieval by keeping users in their current context. Retrieval should feel like recognition, not research.

Primary Research Evidence (User Interviews)

Participant	Quote	Source
P2	“It distracts me. Searching exposes unexpected information. I start reading irrelevant things, which leads to procrastination.”	Interview Notes, p. 27
P2	“Maybe two screens. Right now I switch tabs and lose context. Gemini is a good example—it opens a pop-up so I can search without fully leaving my current task.”	Interview Notes, p. 27
P3	“My focus deviates from the deeper thinking needed to synthesize material and shifts to getting new information from search. Then I have to track what I left off on.”	Interview Notes, p. 36
P3	“Side panels—maybe sticky notes… Having a sticky note on the side while searching could be nice, because it can stay on the screen while Chrome is in the back.”	Interview Notes, p. 36

Pattern: Search mode disrupts cognitive flow. Users want retrieval that doesn’t require leaving current context.

Secondary Research Evidence

Finding	Source
Microsoft CHI 2025: “AI shifts cognitive work toward verification, integration, and task oversight. Workers expend more effort on high-stakes tasks, less on routine work.”	Tankelevitch, L., et al. (2025). Tools for Thought. CHI 2025. Microsoft Research.
“Retrieval breakdown is particularly costly because it occurs during in-flow project work—moments when working memory is already occupied with complex synthesis tasks.”	Winter Report, Finding 1
Average knowledge worker uses 10+ tools daily.	Winter Report, Finding 3

Competitive Analysis Evidence

Tool	Evidence
Granola	Success due to “bot-free” approach—doesn’t interrupt meetings
Gemini popup	P2 specifically cited as positive example of in-context retrieval
All AI containers	Require context switch to separate app/tab

Evidence Strength: MODERATE-STRONG

2/6 participants directly quoted on cognitive disruption (P2, P3); others not quoted on this topic
Users specifically requested in-context solutions (side panels, pop-ups)
Microsoft research validates cognitive load impact

Summary: Evidence Strength by Requirement

Requirement	Interview Evidence	Literature Evidence	Competitive Evidence	Overall
1. Recognition Over Recall	6/6 participants (4 quoted)	Vocabulary gap in keyword search	Visual PKM success	STRONG
2. Persistent Project Context	Decay observed (4 quoted)	19x project tool growth	Claude Code/Windsurf	STRONG
3. Source Traceability	2 explicit, observation	Agentic eval metrics	Gap in all tools	MODERATE-STRONG
4. Reduce Context Switching	2/6 quoted on disruption	CHI 2025 research	Granola success	MODERATE-STRONG

Evidence Gaps to Address

Weaker Evidence Areas

Source Traceability
Only 2 participants explicitly mentioned this need (P3, P4)
Could strengthen with more targeted questions in follow-up research
Consider: Is this more acute for academic/research users than general knowledge workers?

Project Scoping vs. Global Knowledge Base
Users expressed need for organization, but did not explicitly request “project boundaries”
This is more of a design hypothesis than a user-stated need
Consider: Could this become another organizational burden?

Questions for Spring Usability Testing

Does bounded project context feel natural or constraining?
Do users actually use visual/recognition features, or revert to search?
Is source traceability visible enough to be useful, or does it add clutter?
Does ambient operation actually reduce context switching, or add cognitive load?

Appendix: Participant Tool Usage

Evidence that knowledge work is inherently multi-tool (supports need for ambient integration):

Participant	Tools Mentioned
P1	Python, Jupyter, Google Slides, Zotero, Google Sheets, Docs, Overleaf, YouTube, Calendar, Canvas, Drive, Gemini
P2	Chrome Bookmarks, Notion, YouTube, Instagram, Figma, FigJam, Granola, Google Search, Obsidian, ChatGPT, Gemini, Claude/Claude Code
P3	Google Calendar, Scholar, Docs, Word, ChatGPT, Excel, Drive, GitHub, Bookmarks, Obsidian, Apple Notes, Notepad, Sticky Notes, OneNote
P4	Python, VS Code, QGIS, ArcGIS, Slack, Jupyter, Word, PowerPoint, GitHub, OneDrive, Scholar, Mendeley, Zotero, Notion, Drive, LinkedIn, ChatGPT, Excel
P5	PyTorch, scikit-learn, AWS, GitHub, Jupyter, Google Slides, Slack, Overleaf, Scholar, Drive, Sheets, Maps
P6	Python, VS Code, QGIS, ArcGIS, Slack, Jupyter, Word, PowerPoint, GitHub, OneDrive, Scholar, Mendeley, Zotero, Notion, Drive, LinkedIn, ChatGPT, Excel, Google Earth Engine

Note: P6’s tool list may be incomplete—transcript was not preserved. Verify against researcher notes.

Average: ~14 tools per participant (P1-P5 verified; P6 approximate)
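
The average can be rechecked directly from the table. The sketch below copies the tool lists as written and assumes that "Claude/Claude Code" counts as a single entry; the P6 list is the approximate one noted above.

```python
# Tool lists copied verbatim from the Participant Tool Usage table.
# Assumption: "Claude/Claude Code" is counted as one entry.
TOOLS = {
    "P1": "Python, Jupyter, Google Slides, Zotero, Google Sheets, Docs, "
          "Overleaf, YouTube, Calendar, Canvas, Drive, Gemini",
    "P2": "Chrome Bookmarks, Notion, YouTube, Instagram, Figma, FigJam, "
          "Granola, Google Search, Obsidian, ChatGPT, Gemini, Claude/Claude Code",
    "P3": "Google Calendar, Scholar, Docs, Word, ChatGPT, Excel, Drive, GitHub, "
          "Bookmarks, Obsidian, Apple Notes, Notepad, Sticky Notes, OneNote",
    "P4": "Python, VS Code, QGIS, ArcGIS, Slack, Jupyter, Word, PowerPoint, "
          "GitHub, OneDrive, Scholar, Mendeley, Zotero, Notion, Drive, "
          "LinkedIn, ChatGPT, Excel",
    "P5": "PyTorch, scikit-learn, AWS, GitHub, Jupyter, Google Slides, Slack, "
          "Overleaf, Scholar, Drive, Sheets, Maps",
    "P6": "Python, VS Code, QGIS, ArcGIS, Slack, Jupyter, Word, PowerPoint, "
          "GitHub, OneDrive, Scholar, Mendeley, Zotero, Notion, Drive, "
          "LinkedIn, ChatGPT, Excel, Google Earth Engine",
}

# Count tools per participant by splitting each row on the list separator.
counts = {p: len(row.split(", ")) for p, row in TOOLS.items()}

# Mean over all six participants, and over P1-P5 only (P6 is approximate).
mean_all = sum(counts.values()) / len(counts)
verified = [counts[p] for p in ("P1", "P2", "P3", "P4", "P5")]
mean_verified = sum(verified) / len(verified)

print(counts)
print(f"all six: {mean_all:.1f}, P1-P5 only: {mean_verified:.1f}")
# prints "all six: 14.5, P1-P5 only: 13.6"
```

Both values are consistent with the reported figure of roughly 14 tools per participant, whether or not the approximate P6 list is included.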