Abstract: |
The recent boom in the availability and use of geolocation technologies has created a pressing need to understand trajectory datasets. However, trajectories have several intrinsic attributes that make them difficult to analyze. First, their time-series nature makes traditional techniques hard to apply. Second, most datasets contain trajectories with many points, which makes for a high-dimensional modeling problem. Third, there are several competing notions of similarity and difference between trajectories. To address these challenges, this thesis proposes several statistical and machine learning methods that provide a deeper understanding of trajectory datasets. In particular, it presents methods for anomaly detection, density estimation, and the construction of spatial graphical models. First, a technique is presented for detecting anomalous trajectories in a dataset in an unsupervised fashion using support vector machines (SVMs) and various spatial representations of trajectories. Second, the thesis develops techniques for density estimation, that is, for assigning a likelihood to each trajectory in a dataset. To perform density estimation on trajectories effectively, it explores combining kernel density estimation (KDE) with a Markovian assumption, under which the next position of a trajectory is conditionally independent of its earlier positions given its most recent ones. Lastly, the thesis explores spatial graphical models. Undirected graphical models describe the conditional independence structure of a set of random variables. Under sparsity assumptions, this concept is used to build graphical models over indicator variables, each associated with a spatial location and indicating whether an agent has come near that location. Experiments were run on two real-world datasets: Automatic Identification System (AIS) tracks of shipping vessels in the English Channel and every Atlantic Ocean tropical storm and hurricane track from 1949 to 2011. |