Parses XML data containing Stanford University department information into a tibble. In the XML, departments are nested within schools; the function extracts each department's code, its full name, and the school it belongs to.

Usage

process_departments_xml(xml_doc)

Arguments

xml_doc

An xml2 document object (for example, the result of xml2::read_xml()) containing Stanford departments data. The document is expected to contain school nodes, each with department nodes nested inside it.

Value

A tibble with three columns:

  • name: Character. Department code/abbreviation (e.g., "CS")

  • longname: Character. Full department name (e.g., "Computer Science")

  • school: Character. Name of the school containing the department

Details

The function performs the following steps:

  1. Locates all school nodes in the XML using XPath

  2. For each school, extracts its name and finds all department nodes

  3. For each department, extracts:

    • Department code (name)

    • Full department name (longname)

    • Associated school name (school)

  4. Combines all departments into a single data frame
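
The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the element names (school, department), the attribute names (name, longname), the sample data, and the helper sketch_departments() are all assumptions made for the example. It returns a base data frame; tibble::as_tibble() would convert it to a tibble.

```r
library(xml2)

# Hypothetical sample input. The element and attribute names used here
# are assumptions about the XML layout, not taken from the package source.
sample_xml <- read_xml('
<schools>
  <school name="School of Engineering">
    <department name="CS" longname="Computer Science"/>
    <department name="EE" longname="Electrical Engineering"/>
  </school>
  <school name="School of Humanities and Sciences">
    <department name="MATH" longname="Mathematics"/>
  </school>
</schools>')

# Hypothetical helper mirroring the four documented steps.
sketch_departments <- function(xml_doc) {
  # Step 1: locate all school nodes with XPath
  schools <- xml_find_all(xml_doc, "//school")
  if (length(schools) == 0) stop("No schools found in XML")

  rows <- vector("list", length(schools))
  for (i in seq_along(schools)) {
    school <- schools[[i]]
    # Step 2: read the school name and find its department nodes
    school_name <- xml_attr(school, "name")
    depts <- xml_find_all(school, "./department")
    # Step 3: one row per department (zero rows if the school has none)
    rows[[i]] <- data.frame(
      name = xml_attr(depts, "name"),
      longname = xml_attr(depts, "longname"),
      school = rep(school_name, length(depts)),
      stringsAsFactors = FALSE
    )
  }

  # Step 4: combine all departments into a single data frame
  result <- do.call(rbind, rows)
  if (nrow(result) == 0) stop("No departments found in XML")
  result
}

departments <- sketch_departments(sample_xml)
```

Building one small data frame per school and binding them at the end keeps the school name attached to each department without tracking indices by hand.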

The function includes error handling for:

  • Missing school data

  • Missing department data

  • XML parsing errors

Error handling

If the XML contains no school nodes or no department nodes, the function throws an error.
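
Because the function stops on empty input, callers that process many files may want to catch the error and continue. The sketch below shows one way to do that with tryCatch(). The wrapper safe_departments() and the stand-in stub_parser() are hypothetical names introduced here so the example runs on its own; in practice process_departments_xml would be passed as the parser argument. The error message text is also an assumption.

```r
library(xml2)

# Hypothetical wrapper: returns NULL instead of stopping when extraction fails.
safe_departments <- function(xml_doc, parser) {
  tryCatch(
    parser(xml_doc),
    error = function(e) {
      message("Could not extract departments: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for process_departments_xml() that only mimics the documented
# "error when no schools are found" behaviour (message text is assumed).
stub_parser <- function(xml_doc) {
  schools <- xml_find_all(xml_doc, "//school")
  if (length(schools) == 0) stop("No schools found in XML")
  data.frame(school = xml_attr(schools, "name"), stringsAsFactors = FALSE)
}

empty_result <- safe_departments(read_xml("<schools/>"), stub_parser)
ok_result <- safe_departments(
  read_xml('<schools><school name="School of Medicine"/></schools>'),
  stub_parser
)
```

Here empty_result is NULL and a message is printed, while ok_result holds the data frame returned by the parser.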

Examples

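if (FALSE) { # \dontrun{
# Read the departments XML from a local file (placeholder path)
xml_data <- xml2::read_xml("departments.xml")
# Convert the parsed document into a tibble of departments
departments_df <- process_departments_xml(xml_data)
} # }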