Extract section and schedule data from a course XML node
Source: R/process.R
extract_section_data.Rd
Processes section-level information from a Stanford course XML node. For each section in the course, the function extracts the section metadata and combines it with the corresponding schedule data.
Value
A tibble containing section and schedule information, or NULL if no sections are found. The tibble includes:
From section information:
objectID: Character. Course identifier (from course_id)
term: Character. Academic term
term_id: Character. Term identifier
section_number: Character. Section number
component: Character. Section component (e.g., "LEC", "DIS")
class_id: Character. Unique class identifier
current_class_size: Numeric. Current enrollment
max_class_size: Numeric. Maximum enrollment
When schedule data exists, additional columns include:
Schedule timing information
Location data
Instructor information
Details
The function performs the following steps:
Locates all section nodes within the course
For each section:
Extracts basic section information using extract_section_info()
Extracts schedule data using extract_schedule_data()
Joins section and schedule information if schedule data exists
Combines all section data into a single tibble
The function returns NULL if no sections are found, allowing for courses that may not have active sections.
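The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the sketch_* helpers are hypothetical stand-ins for extract_section_info() and extract_schedule_data(), the XML element names (section, classId, component, schedule, days) are assumed, and the use of a left join is a guess at how the two tables are combined.

```r
library(xml2)
library(tibble)
library(dplyr)

# Hypothetical stand-in for extract_section_info(); element names are assumed.
sketch_section_info <- function(section_node, course_id) {
  tibble(
    objectID  = course_id,
    class_id  = xml_text(xml_find_first(section_node, "./classId")),
    component = xml_text(xml_find_first(section_node, "./component"))
  )
}

# Hypothetical stand-in for extract_schedule_data(); returns NULL when the
# section has no schedule nodes.
sketch_schedule_data <- function(section_node) {
  schedules <- xml_find_all(section_node, ".//schedule")
  if (length(schedules) == 0) return(NULL)
  tibble(
    section_id = xml_text(xml_find_first(section_node, "./classId")),
    days       = xml_text(xml_find_first(schedules, "./days"))
  )
}

sketch_section_data <- function(course_node, course_id) {
  # Step 1: locate all section nodes within the course.
  sections <- xml_find_all(course_node, ".//section")
  if (length(sections) == 0) return(NULL)

  # Step 2: build one tibble per section, joining schedule data if present.
  per_section <- lapply(seq_along(sections), function(i) {
    info     <- sketch_section_info(sections[[i]], course_id)
    schedule <- sketch_schedule_data(sections[[i]])
    if (is.null(schedule)) return(info)
    left_join(info, schedule, by = c("class_id" = "section_id"))
  })

  # Step 3: combine; bind_rows() fills columns missing from a section with NA.
  bind_rows(per_section)
}

# Usage on a small made-up document.
xml_doc <- read_xml(
  "<courses><course><sections>
     <section><classId>101</classId><component>LEC</component>
       <schedules>
         <schedule><days>Mon</days></schedule>
         <schedule><days>Wed</days></schedule>
       </schedules>
     </section>
     <section><classId>102</classId><component>DIS</component></section>
   </sections></course></courses>"
)
course_node <- xml_find_first(xml_doc, "//course")
res <- sketch_section_data(course_node, "222796")
```

In this sketch a section with two schedule entries contributes two rows, and a section without schedule data contributes one row with NA in the schedule columns.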
Data Joining
Section and schedule data are joined using the class identifier, with class_id from section data matching section_id from schedule data.
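A join across differently named key columns can be illustrated on toy data. The tables below are made up, only the key names class_id and section_id come from this page, and the choice of dplyr::left_join() is an assumption about the join type rather than a statement of what the function does internally.

```r
library(tibble)
library(dplyr)

# Toy section table; `component` values are illustrative.
section_info <- tibble(
  class_id  = c("101", "102"),
  component = c("LEC", "DIS")
)

# Toy schedule table; `days` is a hypothetical schedule column.
schedule_data <- tibble(
  section_id = c("101", "101"),
  days       = c("Mon", "Wed")
)

# The named `by` vector maps the left-hand key (class_id) to the
# right-hand key (section_id); the result keeps the name class_id.
joined <- left_join(
  section_info,
  schedule_data,
  by = c("class_id" = "section_id")
)
```

With a left join, sections that have no matching schedule rows are kept, with NA in the schedule columns.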
See also
extract_section_info() for section information extraction
extract_schedule_data() for schedule data extraction
process_courses_xml() for the parent function using this extraction
Examples
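if (FALSE) { # \dontrun{
# xml_doc is assumed to be a course XML document already parsed with xml2
course_node <- xml2::xml_find_first(xml_doc, "//course")
course_id <- "222796"
# Returns a tibble of section and schedule data, or NULL if no sections exist
section_data <- extract_section_data(course_node, course_id)
} # }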