Extract basic course information from XML node

Extracts fundamental course information from a Stanford course XML node into a structured tibble. This function handles the core course attributes that are common to all courses, independent of sections or schedules.

Usage

extract_basic_course_info(course)

Arguments

course

An xml2 node object representing a single course. Expected to contain child nodes for:

courseId
year
subject
code
title
description
unitsMin
unitsMax
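
For illustration, a minimal input of the expected shape might look like the sketch below. The element names come from the list above. The flat layout, the <courses> wrapper, and all values are assumptions made up for this example; the real feed may nest or order these elements differently.

```r
library(xml2)

# Hypothetical sample document; values are invented for illustration.
xml_doc <- read_xml(
  "<courses>
     <course>
       <courseId>123456</courseId>
       <year>2023-2024</year>
       <subject>CS</subject>
       <code>106A</code>
       <title>Programming Methodology</title>
       <description>Introduction to programming.</description>
       <unitsMin>3</unitsMin>
       <unitsMax>5</unitsMax>
     </course>
   </courses>"
)

# Select the single course node that would be passed as `course`.
course_node <- xml_find_first(xml_doc, "//course")

# List the child element names present on the node.
child_names <- xml_name(xml_children(course_node))
print(child_names)
```

Checking the child names this way before calling the function can help confirm that a node carries every element the function expects.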

Value

A tibble with one row containing:

objectID: Character. Unique course identifier from courseId
year: Character. Academic year
subject: Character. Subject code (e.g., "CS")
code: Character. Course number (e.g., "106A")
title: Character. Full course title
description: Character. Course description text
units_min: Numeric. Minimum units for the course
units_max: Numeric. Maximum units for the course
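
As a rough picture of the return value, the sketch below builds a one-row tibble by hand with the column names and types listed above. The values are invented for illustration and `expected_shape` is a hypothetical name; this is not output produced by the function itself.

```r
library(tibble)

# Illustrative values only; column names and types follow the list above.
expected_shape <- tibble(
  objectID    = "123456",
  year        = "2023-2024",
  subject     = "CS",
  code        = "106A",
  title       = "Programming Methodology",
  description = "Introduction to programming.",
  units_min   = 3,
  units_max   = 5
)

# One class per column: six character columns followed by two numeric ones.
column_classes <- vapply(expected_shape, class, character(1))
print(column_classes)
```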

Details

The function extracts the following course attributes using XPath:

Course ID (unique identifier)
Academic year
Subject code
Course code (number)
Course title
Course description
Unit range (minimum and maximum)

Each field is read with xml_find_first(), which returns only the first node matching the XPath expression. The two unit values are then converted to numeric; all other fields remain character.
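
The extraction pattern described above can be sketched as follows. This is a minimal illustration and not the package's actual source: the helper `first_text()`, the relative XPath form "./name", and the sample XML are all assumptions.

```r
library(xml2)

# Hypothetical, partial course node used only for this sketch.
doc <- read_xml(
  "<courses><course>
     <subject>CS</subject><code>106A</code>
     <unitsMin>3</unitsMin><unitsMax>5</unitsMax>
   </course></courses>"
)
course_node <- xml_find_first(doc, "//course")

# Hypothetical helper: text of the first child element with the given name.
first_text <- function(node, name) {
  xml_text(xml_find_first(node, paste0("./", name)))
}

subject   <- first_text(course_node, "subject")               # stays character
units_min <- as.numeric(first_text(course_node, "unitsMin"))  # converted to numeric
units_max <- as.numeric(first_text(course_node, "unitsMax"))

# An absent child yields NA at this step rather than an error,
# because xml_find_first() returns a missing-node placeholder.
missing_title <- first_text(course_node, "title")
```

Because a missing element surfaces as NA here, any stricter behavior (such as raising an error) has to be added explicitly on top of this pattern.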

Error Handling

The function wraps extraction in tryCatch() to handle XML parsing errors. If a required node is missing or cannot be parsed, it raises an error that describes the failure.
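
One way such error handling could be structured is sketched below. The helpers `required_text()` and `safe_title()`, and the wording of the error messages, are hypothetical and chosen for illustration; the function's real messages may differ.

```r
library(xml2)

# Hypothetical helper: return the text of a required child element,
# or signal an error that names the missing element.
required_text <- function(node, name) {
  value <- xml_text(xml_find_first(node, paste0("./", name)))
  if (is.na(value)) stop("missing <", name, "> element", call. = FALSE)
  value
}

# Hypothetical wrapper: re-raise any failure with added context.
safe_title <- function(node) {
  tryCatch(
    required_text(node, "title"),
    error = function(e) {
      stop("Failed to extract basic course info: ", conditionMessage(e),
           call. = FALSE)
    }
  )
}

# A course node that lacks <title>, to trigger the error path.
doc <- read_xml("<courses><course><subject>CS</subject></course></courses>")
course_node <- xml_find_first(doc, "//course")

# Capture the error message instead of stopping the script.
result <- tryCatch(safe_title(course_node), error = function(e) conditionMessage(e))
print(result)
```

Callers that process many courses can use the same outer tryCatch() pattern to log a failing node and continue with the next one.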

Examples

if (FALSE) { # \dontrun{
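# xml_doc is assumed to be a course document already parsed with xml2::read_xml()
course_node <- xml2::xml_find_first(xml_doc, "//course")
# Returns a one-row tibble of basic course attributes
basic_info <- extract_basic_course_info(course_node)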
} # }