In real-world scenarios with naturally occurring datasets, reference summaries are noisy and may contain information that cannot be inferred from the source text.
The records of a clinical encounter can be extensive and complex, thus placing a premium on tools that can extract and summarize relevant information.
We reframe suicide risk assessment from social media as a ranking problem whose goal is maximizing detection of severely at-risk individuals given the time available.
The vast majority of research in computer-assisted medical coding focuses on coding at the document level, but a substantial proportion of real-world medical coding is done at the level of clinical encounters, each of which is typically represented by a potentially large set of documents.
We report on the creation of a dataset for studying the assessment of suicide risk from online postings on Reddit.