摘要: |
The Iowa DOT consumes data from multiple streams and stores it to support informed decision making. Table 1 gives examples of traffic-operations data sources. In addition to these sources, the DOT maintains a state-of-the-art crash repository and has access to highly detailed weather data through Mesonet. The archives of all these sources extend back several years; the cumulative size of the past 5 years of data can easily be in the range of 15-20 terabytes.
Despite access to an unprecedented amount of data, decision makers are often restricted in their ability to explore these data sets. In general, pre-canned reports are produced serially from each individual data source and circulated to the decision maker, without providing a comprehensive picture of the issue. Under the present setup, a simple query, such as how many crashes happen during congested conditions, cannot be answered easily and requires a dedicated research project. There are four main reasons why decision makers cannot easily query mobility and safety trends: (a) The current data architecture restricts queries across data sources.
(b) Data manipulation is not distributed, so even a simple aggregate query, such as average snowfall per county for a given year, takes a significant amount of time to complete. (c) There is no easy-to-use visual or natural-language-based querying tool. An expert must write complex programs to answer these simple questions, which restricts decision makers to answering a few critical questions rather than being able to query the whole database. (d) No automatic data mining is currently used to detect trends and anomalies. The data is not continuously mined to detect interesting trends automatically, so the onus lies on the agency to explore the data reactively if the system crashes.