Constructing curated knowledge graph structures with AI-assisted semantic processing
- Title
- Constructing curated knowledge graph structures with AI-assisted semantic processing
- Abstract
- Housing and construction research increasingly depends on transforming diverse documents, policies, technical standards and stakeholder inputs into structured knowledge that supports analysis and decision making. Knowledge graphs offer a principled representation of this information, yet their curation remains a bottleneck: converting unstructured text into ontology-aligned, provenance-aware graph structures is labour-intensive and difficult to scale.
This study addresses this challenge by designing and evaluating an AI-assisted pipeline that constructs curated knowledge-graph fragments from real-world documents. Building on the Best Practices in Building Systems (BPiBS) project and its CIV (Collaborative Intelligence Vision) frameworks, the work defines target ontologies and exemplar subgraphs as ground truth. The pipeline integrates large language models for entity, relation and attribute extraction, followed by ontology-aware alignment, canonicalization and provenance tracking. Evaluation uses a design-science, mixed-methods approach that combines quantitative metrics (entity and relation precision, recall, structural overlap and competency-question coverage) with expert review of semantic fidelity, ontology conformance and usefulness for decision-support tasks.
Human-in-the-loop checkpoints are examined to identify where limited reviewer input provides the highest value. The contributions are threefold: (1) a reproducible pipeline with versioned prompts, alignment strategies and export to a Neo4j environment; (2) an evaluation protocol and benchmark artifacts for curated graph construction in housing systems; and (3) guidance on when to rely on automated extraction versus targeted human intervention. By reducing the distance between narrative evidence and computable structure, the approach supports more transparent and updateable system maps and is generalizable to other civil infrastructure domains that require translating heterogeneous text into high-fidelity, queryable knowledge.
- Contributor
- Alex Dekin
- Thomas Froese
- Phalguni Mukhopadhyaya
- Wilma Leung
- Date Submitted
- 2025/12/15
- Format
- Type
- Text
- Extent
- 10 pages
- Language
- en-CA
- Identifier
- 40
- Date
- 2026/05/18

