Abstract: |
National databases that collect textual threat reports, such as ASRS, CERT, and NVD, process
each report manually and individually. They then disseminate the aggregate information through
data products such as newsletters, alerts, and individual report search. The goal of this
research is to connect these individual reports thematically and temporally to identify emerging or
recurring threats, by analyzing large collections of text, source code, and collaboration and
communication patterns. This capability, I argue, lets us identify when such themes emerge and
recur, and the contexts in which they recur, facilitating faster and more effective mitigation. I
propose two models in pursuit of this goal: the commit flow model, an empirical model of
vulnerabilities as bugs, and the topic flow model, a model of vulnerabilities and aviation safety
threats as topics. I use the existing manual workflows in both domains, as reflected in these
organizations' existing data products, as the gold standard, and empirically evaluate whether the
automated models can match or outperform existing manual practices. |