Extract schedule and instructor data from a section XML node
Source: R/process.R
extract_schedule_data.Rd
Processes schedule and instructor information from a Stanford course section XML node. This function extracts meeting times, locations, and detailed instructor information, combining multiple instructors' data into semicolon-separated lists.
Value
A tibble containing schedule information, or NULL if no schedules are found. The tibble includes:
Basic schedule information:
section_id: Character. Section identifier (from classId)
days: Character. Days of the week (e.g., "MonWedFri")
start_time: Character. Start time
end_time: Character. End time
location: Character. Meeting location
Instructor information (NA if no instructors):
instructors: Character. Combined strings in "name (role)" format
instructor_names: Character. Semicolon-separated list of names
instructor_sunets: Character. Semicolon-separated list of SUNet IDs
instructor_roles: Character. Semicolon-separated list of roles
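Because the instructor columns pack several people into one string, downstream code often needs to split them again. The following is a minimal sketch in base R, not part of the package: the row values are made up, `split_field()` is a hypothetical helper, and it assumes the separator is a semicolon optionally followed by a space.

```r
# Hypothetical single row shaped like the documented output (values invented).
schedule_row <- data.frame(
  instructor_names  = "Doe, J.; Roe, R.",
  instructor_sunets = "jdoe; rroe",
  instructor_roles  = "PI; TA",
  stringsAsFactors  = FALSE
)

# Hypothetical helper: split one semicolon-separated field into its parts.
# An NA field (no instructors) yields an empty character vector.
split_field <- function(x) {
  if (is.na(x)) return(character(0))
  trimws(strsplit(x, ";", fixed = TRUE)[[1]])
}

# One row per instructor; assumes the three fields list instructors
# in the same order and have the same number of entries.
instructors_long <- data.frame(
  name  = split_field(schedule_row$instructor_names),
  sunet = split_field(schedule_row$instructor_sunets),
  role  = split_field(schedule_row$instructor_roles),
  stringsAsFactors = FALSE
)
```

Splitting on the semicolon rather than the comma keeps names written as "Last, First" intact; a name that itself contains a semicolon would not survive this round trip.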
Details
The function performs the following steps:
Locates all schedule nodes within the section
For each schedule:
Extracts basic schedule information (days, times, location)
Processes instructor information if present
Combines multiple instructors' data into consolidated strings
Instructor information is formatted in several ways:
Combined format: "name (role)"
Separate fields: names, SUNet IDs, and roles in semicolon-separated lists
If no instructors are found, instructor fields are set to NA.
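The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual implementation: the XML element names (`schedule`, `days`, `startTime`, `endTime`, `location`, `instructor`, `name`, `sunet`, `role`, and `classId` as a child element), the `"; "` separator, and the helper names `collapse_or_na()` and `sketch_extract_schedules()` are all hypothetical.

```r
library(xml2)

# Hypothetical section XML; the real feed's element names may differ.
section_node <- read_xml('
<section>
  <classId>12345</classId>
  <schedule>
    <days>MonWedFri</days><startTime>09:30</startTime><endTime>10:20</endTime>
    <location>Room 101</location>
    <instructor><name>Doe, J.</name><sunet>jdoe</sunet><role>PI</role></instructor>
    <instructor><name>Roe, R.</name><sunet>rroe</sunet><role>TA</role></instructor>
  </schedule>
  <schedule>
    <days>Tue</days><startTime>13:00</startTime><endTime>14:00</endTime>
    <location>Room 202</location>
  </schedule>
</section>')

# Collapse a character vector into one string, or NA when it is empty.
collapse_or_na <- function(x) {
  if (length(x) == 0) NA_character_ else paste(x, collapse = "; ")
}

sketch_extract_schedules <- function(section_node) {
  # Step 1: locate all schedule nodes within the section.
  schedules <- xml_find_all(section_node, ".//schedule")
  if (length(schedules) == 0) return(NULL)
  section_id <- xml_text(xml_find_first(section_node, "./classId"))

  rows <- lapply(seq_along(schedules), function(i) {
    s <- schedules[[i]]
    # Step 2b: instructor fields; empty vectors when none are present.
    inst_names <- xml_text(xml_find_all(s, ".//instructor/name"))
    inst_roles <- xml_text(xml_find_all(s, ".//instructor/role"))
    inst_sunets <- xml_text(xml_find_all(s, ".//instructor/sunet"))
    # paste0() recycles zero-length input to "", so guard the empty case.
    combined <- if (length(inst_names) == 0) character(0)
                else paste0(inst_names, " (", inst_roles, ")")
    # Step 2a and 2c: basic fields plus consolidated instructor strings.
    data.frame(
      section_id = section_id,
      days = xml_text(xml_find_first(s, "./days")),
      start_time = xml_text(xml_find_first(s, "./startTime")),
      end_time = xml_text(xml_find_first(s, "./endTime")),
      location = xml_text(xml_find_first(s, "./location")),
      instructors = collapse_or_na(combined),
      instructor_names = collapse_or_na(inst_names),
      instructor_sunets = collapse_or_na(inst_sunets),
      instructor_roles = collapse_or_na(inst_roles),
      stringsAsFactors = FALSE
    )
  })
  tibble::as_tibble(do.call(rbind, rows))
}

out <- sketch_extract_schedules(section_node)
```

In this sketch the second schedule has no instructors, so its four instructor columns come back as NA while the schedule columns are still filled in.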
See also
xml2::xml_find_all() for XML node selection
xml2::xml_text() for text extraction
extract_section_data() for the parent function using this data
Examples
if (FALSE) { # \dontrun{
section_node <- xml2::xml_find_first(course_node, ".//section")
schedule_data <- extract_schedule_data(section_node)
} # }