Extracts fundamental course information from a Stanford course XML node into a structured tibble. This function handles the core course attributes that are common to all courses, independent of sections or schedules.

Usage

extract_basic_course_info(course)

Arguments

course

An xml2 node object representing a single course. Expected to contain child nodes for:

  • courseId

  • year

  • subject

  • code

  • title

  • description

  • unitsMin

  • unitsMax
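
As a rough illustration of the expected input, the sketch below builds a course node by hand. It assumes a simplified, flat layout in which every field is a direct child of `<course>`. The real feed may nest or order elements differently, and all values shown are made-up placeholders.

```r
library(xml2)

# Hypothetical sample document: element names follow the list above,
# values are placeholders invented for illustration only.
xml_doc <- read_xml("
<courses>
  <course>
    <courseId>000001</courseId>
    <year>placeholder-year</year>
    <subject>CS</subject>
    <code>106A</code>
    <title>Example Course Title</title>
    <description>Example description text.</description>
    <unitsMin>3</unitsMin>
    <unitsMax>5</unitsMax>
  </course>
</courses>")

# Select the first <course> element anywhere in the document.
course_node <- xml_find_first(xml_doc, "//course")

# List the names of its child elements to confirm the structure.
child_names <- xml_name(xml_children(course_node))
print(child_names)
```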

Value

A tibble with one row containing:

  • objectID: Character. Unique course identifier from courseId

  • year: Character. Academic year

  • subject: Character. Subject code (e.g., "CS")

  • code: Character. Course number (e.g., "106A")

  • title: Character. Full course title

  • description: Character. Course description text

  • units_min: Numeric. Minimum units for the course

  • units_max: Numeric. Maximum units for the course

Details

The function extracts the following course attributes using XPath:

  • Course ID (unique identifier)

  • Academic year

  • Subject code

  • Course code (number)

  • Course title

  • Course description

  • Unit range (minimum and maximum)

Each field is located with xml_find_first(), which returns the first node matching an XPath expression. The two unit values are then converted from text to numeric.
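
The extraction described here can be sketched roughly as follows. This is not the package's actual source: `node_text()` and `sketch_basic_course_info()` are hypothetical names, the relative XPath `.//name` is an assumption about how nodes are located, and the sample values are placeholders.

```r
library(xml2)
library(tibble)

# Hypothetical helper: text of the first descendant element called `name`.
node_text <- function(node, name) {
  xml_text(xml_find_first(node, paste0(".//", name)))
}

# Illustrative stand-in for the documented function, not its real implementation.
sketch_basic_course_info <- function(course) {
  tibble(
    objectID    = node_text(course, "courseId"),
    year        = node_text(course, "year"),
    subject     = node_text(course, "subject"),
    code        = node_text(course, "code"),
    title       = node_text(course, "title"),
    description = node_text(course, "description"),
    # Unit values arrive as text and are converted to numeric.
    units_min   = as.numeric(node_text(course, "unitsMin")),
    units_max   = as.numeric(node_text(course, "unitsMax"))
  )
}

# Placeholder input with made-up values.
course <- read_xml(paste0(
  "<course><courseId>000001</courseId><year>placeholder-year</year>",
  "<subject>CS</subject><code>106A</code><title>Example Course Title</title>",
  "<description>Example description text.</description>",
  "<unitsMin>3</unitsMin><unitsMax>5</unitsMax></course>"
))

info <- sketch_basic_course_info(course)
print(info)
```

Keeping the XPath lookup in a single helper means each column is one short line, and the text-to-numeric conversion stays visible where the unit columns are defined.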

Error Handling

The function uses tryCatch to handle potential XML parsing errors. If any required node is missing or cannot be parsed, it throws an error with details about the failure.
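
A minimal sketch of this pattern is shown below, assuming a missing node is detected by checking for the `xml_missing` class that `xml_find_first()` returns when nothing matches. The helper names and the error message wording are hypothetical, not taken from the package.

```r
library(xml2)

# Hypothetical helper: return the element's text, or fail if it is absent.
required_text <- function(node, name) {
  found <- xml_find_first(node, paste0(".//", name))
  if (inherits(found, "xml_missing")) {
    stop("missing required node <", name, ">", call. = FALSE)
  }
  xml_text(found)
}

# Wrap the lookup in tryCatch() and re-throw with added context.
safe_title <- function(course) {
  tryCatch(
    required_text(course, "title"),
    error = function(e) {
      stop("Failed to extract basic course info: ",
           conditionMessage(e), call. = FALSE)
    }
  )
}

# A course node with no <title> element triggers the error path.
incomplete <- read_xml("<course><courseId>000001</courseId></course>")
result <- tryCatch(safe_title(incomplete),
                   error = function(e) conditionMessage(e))
print(result)
```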

Examples

if (FALSE) { # \dontrun{
# xml_doc is assumed to be a document already parsed with xml2::read_xml()
course_node <- xml2::xml_find_first(xml_doc, "//course")
basic_info <- extract_basic_course_info(course_node)
} # }