The six voice labels

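Each letter is assigned one of six voice labels: cullen, attending_physician, patient, family, peer_physician, or excluded.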

How the rule works

The classification uses two pieces of metadata from the Cullen Project database: the letter's direction (outgoing or incoming) and the author's case role, that is, the relationship between the letter's author and the patient discussed in the letter. Every outgoing letter is classified as cullen regardless of case role, because Cullen is treated as the author of all outgoing correspondence, even when a letter is not in his own handwriting.

For incoming letters, the author's case role determines the voice. A letter from someone identified as the patient's physician is classified as attending_physician; a letter from the patient themselves as patient; a letter from a relative or friend of the patient as family; and a letter from a physician with no relationship to the patient as peer_physician.

When no case role is recorded but the author is flagged as a medical professional in the persons table, the letter falls back to attending_physician. Letters with no classifiable role are marked excluded and omitted from voice-stratified analyses.
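
The rule described above can be sketched as a small function. This is an illustrative sketch only, not the Explorer's actual code: the function name classify_voice, the is_medical flag, and the case-role strings in ROLE_TO_VOICE are hypothetical stand-ins for whatever values the Cullen Project database actually stores. It also assumes that a recorded but unrecognised case role is treated as excluded, which the description above does not state explicitly.

```python
from typing import Optional

# Hypothetical case-role values; the database's real vocabulary may differ.
ROLE_TO_VOICE = {
    "patients_physician": "attending_physician",
    "patient": "patient",
    "relative_or_friend": "family",
    "physician_no_relationship": "peer_physician",
}


def classify_voice(direction: str,
                   case_role: Optional[str],
                   is_medical: bool = False) -> str:
    """Return one of the six voice labels for a single letter."""
    # Outgoing letters are always Cullen's voice, whatever the case role.
    if direction == "outgoing":
        return "cullen"

    # Incoming letters: a recognised case role decides the voice.
    if case_role in ROLE_TO_VOICE:
        return ROLE_TO_VOICE[case_role]

    # No case role recorded, but the author is flagged as a medical
    # professional in the persons table: fall back to attending_physician.
    if case_role is None and is_medical:
        return "attending_physician"

    # Anything else has no classifiable role (assumption: this includes
    # recorded roles that are not in the mapping).
    return "excluded"


def voice_stratified(letters):
    """Keep only (letter, voice) pairs that are not marked excluded."""
    return [(letter, voice) for letter, voice in letters if voice != "excluded"]
```

For example, under these assumptions classify_voice("incoming", None, is_medical=True) returns attending_physician, while classify_voice("incoming", None) returns excluded and the letter would be dropped by voice_stratified.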

Limitations

This classification operates at the letter level. When an attending physician quotes or paraphrases a patient's words — a common practice in consultation letters — the patient's voice appears within a physician-classified letter. The current version of the Explorer cannot identify these embedded voices; a future span-level annotation pass would be needed.

Approximately 2–3% of incoming letters are excluded as unclassifiable. These are primarily letters in which the author's relationship to the patient could not be determined from the available metadata.