Extracts fundamental section-level information from a Stanford course section XML node into a structured tibble. This function processes the core attributes of a course section, such as term information, component type, and enrollment data.

Usage

extract_section_info(section, course_id)

Arguments

section

An xml2 node object representing a single course section. Expected to contain child nodes for:

  • term

  • termId

  • sectionNumber

  • component

  • classId

  • currentClassSize

  • maxClassSize

course_id

Character string. The parent course identifier used to link section data back to the course.

Value

A tibble with one row containing:

  • objectID: Character. Course identifier (from course_id)

  • term: Character. Academic term (e.g., "Autumn", "Winter")

  • term_id: Character. Unique term identifier

  • section_number: Character. Section number within the course

  • component: Character. Section type (e.g., "LEC", "DIS", "LAB")

  • class_id: Character. Unique identifier for this section

  • current_class_size: Numeric. Current number of enrolled students

  • max_class_size: Numeric. Maximum enrollment capacity

Details

The function extracts the following section attributes using XPath:

  • Term details (term name and ID)

  • Section identification (section number, class ID)

  • Component type (e.g., lecture, discussion)

  • Enrollment information (current and maximum class sizes)

Each field is read from the first node that matches its XPath expression, using xml_find_first(). The two enrollment values (currentClassSize and maxClassSize) are then converted from text to numeric.
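The extraction pattern described above can be sketched roughly as follows. This is a minimal illustration rather than the package's actual source: the helper first_text(), the function name sketch_section_info(), and the sample XML values are all hypothetical, and the real section XML may carry additional nodes.

```r
library(xml2)
library(tibble)

# Hypothetical sample section; all values are made up for illustration.
section_xml <- read_xml(paste0(
  "<section>",
  "<term>Autumn</term>",
  "<termId>1242</termId>",
  "<sectionNumber>01</sectionNumber>",
  "<component>LEC</component>",
  "<classId>12345</classId>",
  "<currentClassSize>120</currentClassSize>",
  "<maxClassSize>150</maxClassSize>",
  "</section>"
))

# Hypothetical helper: text of the first node matching a relative XPath.
first_text <- function(node, xpath) {
  xml_text(xml_find_first(node, xpath))
}

# Sketch of the documented behaviour: one row, eight columns.
sketch_section_info <- function(section, course_id) {
  tibble(
    objectID           = course_id,
    term               = first_text(section, "./term"),
    term_id            = first_text(section, "./termId"),
    section_number     = first_text(section, "./sectionNumber"),
    component          = first_text(section, "./component"),
    class_id           = first_text(section, "./classId"),
    # Enrollment values arrive as text and are converted to numeric.
    current_class_size = as.numeric(first_text(section, "./currentClassSize")),
    max_class_size     = as.numeric(first_text(section, "./maxClassSize"))
  )
}

info <- sketch_section_info(section_xml, "CS106A-2023-2024")
```

Identifier-like fields such as section_number stay as character here, which preserves leading zeros ("01") that a numeric conversion would drop.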

Error Handling

The function assumes that every required node is present in the XML. If a node is missing, the resulting error is raised through the function's tryCatch block.
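One point worth keeping in mind is that xml_find_first() does not itself fail on a missing node: it returns a missing-node object, and xml_text() on that object yields NA. A function that wants an error therefore has to check for it. The sketch below shows one way this could look; required_text() and the message wording are hypothetical and are not taken from the package source.

```r
library(xml2)

# Hypothetical helper: stop with a clear message when a node is absent.
required_text <- function(node, xpath) {
  value <- xml_text(xml_find_first(node, xpath))
  if (is.na(value)) {
    stop("Missing required node: ", xpath, call. = FALSE)
  }
  value
}

# A deliberately incomplete section with no <termId> child.
incomplete <- read_xml("<section><term>Autumn</term></section>")

# tryCatch intercepts the error, reports it, and returns NULL instead.
result <- tryCatch(
  required_text(incomplete, "./termId"),
  error = function(e) {
    message("Section extraction failed: ", conditionMessage(e))
    NULL
  }
)
```

Whether the handler should return NULL, re-raise with added context, or fall back to NA is a design choice; this sketch returns NULL only to keep the example short.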

Examples

if (FALSE) { # \dontrun{
section_node <- xml2::xml_find_first(course_node, ".//section")
course_id <- "CS106A-2023-2024"
section_info <- extract_section_info(section_node, course_id)
} # }