Process Stanford course XML data into a data frame

Parses XML data containing Stanford University course information into a structured data frame. The function processes detailed course data including basic course information, section details, schedules, and instructor information.

Usage

process_courses_xml(xml_doc, department)

Arguments

xml_doc: An xml2 document object containing Stanford course data. Expected to have a structure with course nodes containing section and schedule information.
department: Character string. Department code (e.g., "CS") used to identify the department for all courses in the XML.

Value

A tibble containing course information with columns:

objectID: Character. Unique course identifier
year: Character. Academic year
subject: Character. Subject code
code: Character. Course number
title: Character. Course title
description: Character. Course description
units_min: Numeric. Minimum units
units_max: Numeric. Maximum units
Additional columns for section, schedule, and instructor information when available
department: Character. Department code

Returns NULL, with a warning, if no courses are found.

Details

The function processes course data in several stages:

1. Locates all course nodes in the XML using XPath
2. For each course:
- Extracts basic course information (ID, title, units, etc.)
- Extracts section data including schedules and instructors
- Joins section data with basic course information
3. Adds department code to all courses
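
The stages above can be sketched roughly as follows. This is a minimal illustration, not the function's actual implementation. The XML element names (courseId, unitsMin, unitsMax, sections/section, term, component), the helper names child_text and process_one, and the sample data are all assumptions made for the example. The real document structure may differ.

```r
library(xml2)

# Hypothetical, simplified input; element names and values are assumptions.
xml_doc <- read_xml('
<courses>
  <course>
    <courseId>000001</courseId>
    <subject>CS</subject>
    <code>101</code>
    <title>Example Course</title>
    <unitsMin>3</unitsMin>
    <unitsMax>5</unitsMax>
    <sections>
      <section><term>Autumn</term><component>LEC</component></section>
      <section><term>Winter</term><component>DIS</component></section>
    </sections>
  </course>
</courses>')

# Stage 1: locate all course nodes with XPath.
course_nodes <- xml_find_all(xml_doc, ".//course")

# Hypothetical helper: text of the first matching child, NA if it is absent.
child_text <- function(node, name) {
  xml_text(xml_find_first(node, paste0("./", name)))
}

# Stage 2: for one course, extract basic info and sections, then join them.
process_one <- function(course) {
  basic <- data.frame(
    objectID  = child_text(course, "courseId"),
    subject   = child_text(course, "subject"),
    code      = child_text(course, "code"),
    title     = child_text(course, "title"),
    units_min = as.numeric(child_text(course, "unitsMin")),
    units_max = as.numeric(child_text(course, "unitsMax")),
    stringsAsFactors = FALSE
  )
  section_nodes <- xml_find_all(course, "./sections/section")
  if (length(section_nodes) == 0) {
    # Keep the course as a single row with missing section fields.
    sections <- data.frame(
      objectID = basic$objectID,
      term = NA_character_,
      component = NA_character_,
      stringsAsFactors = FALSE
    )
  } else {
    sections <- data.frame(
      objectID  = basic$objectID,
      term      = xml_text(xml_find_first(section_nodes, "./term")),
      component = xml_text(xml_find_first(section_nodes, "./component")),
      stringsAsFactors = FALSE
    )
  }
  # One row per section, carrying the course-level columns along.
  merge(basic, sections, by = "objectID")
}

rows <- lapply(seq_along(course_nodes), function(i) process_one(course_nodes[[i]]))
courses <- do.call(rbind, rows)

# Stage 3: add the department code to every row.
courses$department <- "CS"
```

In this sketch each section becomes its own row, so a course with two sections appears twice with identical course-level columns. Whether process_courses_xml uses the same row layout is not stated here.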

Course sections may include:

Term information
Class components (e.g., lecture, discussion)
Schedule details (days, times, locations)
Instructor information
Enrollment data

Examples

if (FALSE) { # \dontrun{
xml_data <- xml2::read_xml("cs_courses.xml")
cs_courses <- process_courses_xml(xml_data, "CS")
} # }
