Extract section and schedule data from a course XML node

Extracts section-level information from a Stanford course XML node and combines each section's metadata with its schedule data when schedule data is present.

Usage

extract_section_data(course, course_id)

Arguments

course: An xml2 node object representing a single course. Expected to contain child section nodes, each potentially containing schedule information.
course_id: Character string. The course identifier used to link section data back to the parent course.

Value

A tibble containing section and schedule information, or NULL if no sections are found. The tibble includes:

From section information:

objectID: Character. Course identifier (from course_id)
term: Character. Academic term
term_id: Character. Term identifier
section_number: Character. Section number
component: Character. Section component (e.g., "LEC", "DIS")
class_id: Character. Unique class identifier
current_class_size: Numeric. Current enrollment
max_class_size: Numeric. Maximum enrollment

When schedule data exists, additional columns include:

Schedule timing information
Location data
Instructor information

Details

The function performs the following steps:

1. Locates all section nodes within the course.
2. For each section:
   - Extracts basic section information using extract_section_info()
   - Extracts schedule data using extract_schedule_data()
   - Joins section and schedule information if schedule data exists
3. Combines all section data into a single tibble.

The function returns NULL if no sections are found, allowing for courses that may not have active sections.
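
The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: section_info_stub() and schedule_stub() are hypothetical stand-ins for extract_section_info() and extract_schedule_data(), the XML element names (section, classId, sectionNumber, schedule, days) are assumptions about the feed layout, and the use of a left join is likewise assumed.

```r
library(xml2)
library(dplyr)

# Hypothetical stand-in for extract_section_info(); the real helper
# returns more columns (term, component, class sizes, ...).
section_info_stub <- function(section, course_id) {
  tibble::tibble(
    objectID = course_id,
    section_number = xml_text(xml_find_first(section, "./sectionNumber")),
    class_id = xml_text(xml_find_first(section, "./classId"))
  )
}

# Hypothetical stand-in for extract_schedule_data(); returns NULL when
# the section has no schedule nodes.
schedule_stub <- function(section) {
  schedules <- xml_find_all(section, ".//schedule")
  if (length(schedules) == 0) return(NULL)
  tibble::tibble(
    section_id = xml_text(xml_find_first(section, "./classId")),
    days = xml_text(xml_find_first(schedules, "./days"))
  )
}

extract_section_data_sketch <- function(course, course_id) {
  # Step 1: locate all section nodes; no sections means NULL.
  sections <- xml_find_all(course, ".//section")
  if (length(sections) == 0) return(NULL)

  # Step 2: build one tibble per section, joining schedule data if present.
  rows <- lapply(sections, function(section) {
    info <- section_info_stub(section, course_id)
    schedule <- schedule_stub(section)
    if (is.null(schedule)) return(info)
    left_join(info, schedule, by = c("class_id" = "section_id"))
  })

  # Step 3: combine; bind_rows() fills missing schedule columns with NA.
  bind_rows(rows)
}

# Made-up XML for illustration only.
doc <- read_xml(
  "<course><sections>
     <section><classId>1001</classId><sectionNumber>01</sectionNumber>
       <schedules><schedule><days>Mon Wed</days></schedule></schedules>
     </section>
     <section><classId>1002</classId><sectionNumber>02</sectionNumber></section>
   </sections></course>"
)
result <- extract_section_data_sketch(doc, "222796")
no_sections <- extract_section_data_sketch(read_xml("<course/>"), "222796")
```

In this sketch the second section has no schedule, so its row keeps only the section columns and receives NA in the schedule column after the rows are combined.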

Data Joining

Section and schedule data are joined using the class identifier, with class_id from section data matching section_id from schedule data.
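
As an illustration of this key mapping, the sketch below joins two small hand-made tibbles with dplyr. The values are invented, and the choice of left_join() is an assumption; the source only states that class_id is matched against section_id.

```r
library(dplyr)

# Illustrative section data (values are made up).
section_info <- tibble::tibble(
  objectID = "222796",
  class_id = c("1001", "1002"),
  component = c("LEC", "DIS")
)

# Illustrative schedule data: section 1001 meets twice, 1002 has no schedule.
schedule <- tibble::tibble(
  section_id = c("1001", "1001"),
  days = c("Mon", "Wed")
)

# A named `by` vector maps the left-hand key to the right-hand key.
joined <- left_join(section_info, schedule, by = c("class_id" = "section_id"))
```

With a join of this kind, a section that has several schedule entries appears once per entry, and the key column keeps the left-hand name (class_id).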

Error Handling

If section data extraction fails, the function throws an error with details.
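
Because the function can either return NULL or raise an error, a caller that processes many courses may want to handle both cases. The following is one possible pattern, not part of the package: collect_sections() is a hypothetical helper, and the extractor is passed in as an argument (in practice it would be extract_section_data) so that the sketch runs on its own with a fake extractor.

```r
library(dplyr)

# Hypothetical helper: apply an extractor to each course, skip failures
# with a warning, drop NULL results, and stack what remains.
collect_sections <- function(courses, course_ids, extractor) {
  results <- Map(function(course, id) {
    tryCatch(
      extractor(course, id),
      error = function(e) {
        warning("Skipping course ", id, ": ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  }, courses, course_ids)

  # NULL means "no sections found" or "extraction failed and was skipped".
  results <- Filter(Negate(is.null), results)
  if (length(results) == 0) return(NULL)
  bind_rows(results)
}

# Fake extractor used only to exercise the three possible outcomes.
fake_extractor <- function(course, id) {
  if (id == "bad") stop("malformed section")
  if (id == "empty") return(NULL)
  tibble::tibble(objectID = id, class_id = "1")
}

out <- suppressWarnings(
  collect_sections(list(1, 2, 3), c("ok", "bad", "empty"), fake_extractor)
)
```

Whether to skip a failing course or stop the whole run is a design choice; this sketch skips and warns so that one malformed course does not discard the rest.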

Examples

if (FALSE) { # \dontrun{
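# xml_doc is assumed to be an already-parsed XML document,
# for example the result of xml2::read_xml()
course_node <- xml2::xml_find_first(xml_doc, "//course")
course_id <- "222796"
# Returns a tibble, or NULL if the course has no sections
section_data <- extract_section_data(course_node, course_id)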
} # }