Technical


Caffeinated Professor lecturer / textbook agent

  • Source-grounded retrieval over authored materials via embedding (RAG) and markdown (MD/second mind). Training content indexed as searchable chunks (lecture transcripts aligned to timestamps, slides with headings, and textbook sections) or as MD files for routing (natural language summaries of concepts). Queries retrieve the most relevant passages (RAG) or cards (MD) before answer generation.
  • Cascading fidelity. Grounded in authored sources (recorded course lectures, slides, textbook, published papers), the agent provides verifiable citations, and where answers cannot be derived directly from source materials, the fidelity decrease is clearly marked.
  • Citations/timestamps in responses. Responses include a short “Sources” block with 1 to 3 pointers to the exact material used (e.g., “Lecture 3 @ 12:40–14:05”, “Slides: Stakeholder Theory—Objections”, “Textbook Ch. 5 §2”). Citations use stable IDs to remain valid across transcript re-exports and content updates (e.g., L03_T12m40s, SLD_Stakeholder_Objections, TXT_BE_CH5_S2).
  • If the system cannot locate strong sources, it asks clarifying questions before relying on general knowledge.
  • Attribution integrity checks (anti-hallucination). Citations validated against indexed corpus. System not allowed to invent timestamps, slide titles, or readings. When retrieval confidence is low, it cascades down to “limited answer” mode.
  • Subject matter guardrails. Queries and inputs filtered for relevance. Interactions contained within the scope of the authored content.
  • Token burn. User authentication. Hard token ceilings per user and platform-wide.
  • Mimetic style layer. A constrained persona prompt (+ a library of Q&A exchanges) enforces the professor’s characteristic vocabulary and teaching style, while preventing drift into a generic tutor voice.
  • Privacy-by-design deployment. Data collection minimized to what is necessary for learning analytics and improvement; configurable data-retention policies suitable for institutional environments and international compliance. (Tool currently operating in Europe and compliant with EU regulation.)
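
The attribution integrity check described above can be sketched as follows. This is a minimal illustration and not the deployed implementation: the ID patterns are inferred only from the three example IDs in the text, and the names INDEXED_IDS, check_citations, answer_mode and the min_confidence threshold are all hypothetical.

```python
import re

# Hypothetical index of stable citation IDs. The three entries are the
# examples from the text; storing them in a set is an assumption.
INDEXED_IDS = {"L03_T12m40s", "SLD_Stakeholder_Objections", "TXT_BE_CH5_S2"}

# Assumed ID shapes, inferred from those three examples only.
ID_PATTERN = re.compile(r"\b(?:L\d{2}_T\d+m\d{2}s|SLD_\w+|TXT_\w+)\b")


def check_citations(draft, indexed_ids):
    """Split the IDs cited in a draft answer into (valid, invented)."""
    cited = list(dict.fromkeys(ID_PATTERN.findall(draft)))  # dedupe, keep order
    valid = [c for c in cited if c in indexed_ids]
    invented = [c for c in cited if c not in indexed_ids]
    return valid, invented


def answer_mode(draft, indexed_ids, retrieval_confidence, min_confidence=0.5):
    """Return "grounded" or "limited"; min_confidence is a made-up threshold."""
    valid, invented = check_citations(draft, indexed_ids)
    if invented or not valid or retrieval_confidence < min_confidence:
        return "limited"  # cascade down to "limited answer" mode
    return "grounded"
```

For example, a draft citing L03_T12m40s together with an unindexed SLD_Made_Up would fall into "limited" mode under this sketch, because one of its citations cannot be found in the corpus.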

Caffeinated Professor assessment engine

After the agent poses a question and receives a response, it calculates the distance between the student response and a predicted proficient answer. To reduce that distance, the agent provides cues for a subsequent response. The cycle continues until the response reaches the proficiency standard. The resulting grade derives from the number of nudges required to reach proficiency.
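
The cycle can be sketched roughly as below. Everything here is an assumption made for illustration: the threshold, the cap on nudges, the linear grade mapping, and the function names run_assessment and grade do not come from the source, which states only that the grade derives from the nudges required.

```python
def run_assessment(responses, distance, threshold, max_nudges=4):
    """Walk through successive student responses.

    `responses` is a pre-recorded list of answers (a stand-in for a live
    dialogue) and `distance` maps a response to its distance from the
    proficient answer. Returns (reached_proficiency, nudges_used).
    """
    nudges = 0
    for response in responses:
        if distance(response) <= threshold:
            return True, nudges
        if nudges == max_nudges:
            break
        nudges += 1  # the agent gives a cue; the student answers again
    return False, nudges


def grade(reached, nudges, max_nudges=4):
    """Assumed linear mapping: 1.0 with no nudges, lower with each nudge."""
    if not reached:
        return 0.0
    return 1.0 - nudges / (max_nudges + 1)


# Toy distances for three successive attempts (invented values).
toy = {"first try": 0.8, "second try": 0.5, "third try": 0.1}
reached, nudges = run_assessment(list(toy), toy.get, threshold=0.2)
score = grade(reached, nudges)
```

In this toy run the student reaches the threshold on the third attempt, after two nudges. Any monotone decreasing mapping from nudges to grade would fit the description equally well.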

  • Distance calculation measures a proficient response against a student response.
    1. Question submitted by human professor or generated from a source-grounded retrieval over course materials.
      Proficient response provided by human professor or agent-predicted from source materials.
    2. Student response prepared for analysis by language embedding or markdown summary.
    3. Distance between language vectors or markdown summaries is measured in semantic or content space.
      Integrity check: Engine validates the student's answer against the indexed corpus.
      Risk: System mostly measures semantic style/verbosity, so a correct answer phrased differently or with synonyms looks “far” while a wrong answer composed from accurate keywords looks “close.”
  • Scaffolding responds to the distance between the proficient response and the student's response by leading the student through discussion toward proficiency.
    1. Nudge: The agent provides a clarifying question or a targeted hint based on a specific slide or textbook section.
      Anti-Hallucination: Hints validated against the indexed corpus so they don't lead toward incorrect or generic answers.
      Cycle repeats.
  • Grading is a function of the number of hints required to reach proficiency.
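
The risk noted under the distance calculation can be made concrete with a small sketch. It uses a bag-of-words cosine distance as a deliberately crude stand-in for the language embedding, so it exaggerates the effect; a learned embedding would handle synonyms better. The sentences and function names are invented for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_distance(a, b):
    """1 minus the cosine similarity of the two word-count vectors."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return 1.0 - dot / norm if norm else 1.0


proficient = "Stakeholder theory says managers owe duties to everyone affected by the firm"
# Correct idea, different words.
paraphrase = "Executives have obligations toward all parties a company impacts"
# Wrong idea ("nobody"), same keywords.
keyword_mix = "Stakeholder theory says managers owe duties to nobody affected by the firm"

d_paraphrase = cosine_distance(proficient, paraphrase)
d_keyword_mix = cosine_distance(proficient, keyword_mix)
```

Under this stand-in the incorrect keyword-matched answer scores as closer to the proficient response than the correct paraphrase does, which is the failure mode the integrity check and corpus-validated hints are meant to catch.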