DycomDetector

Schematic overview of DycomDetector

DycomDetector is a novel approach for topic modeling in text corpora. Our algorithm extracts and classifies the keywords, calculates relationships based on keyword co-ocurrences, constructs the networks at different time points, applies a graph partitioning algorithm to extract latent communities. The intuitive interface of our system supports various interactive features, such as lensing and filtering by the sudden increase in term frequency, vertex degree, betweenness centrality, etc. It also allows the users to search for a topic of interest and visualize its temporal relations with other detected communities. DycomDetector schema

To enable users to explore the vast temporal text corpora in an efficient way, DycomDetector adopts following steps (as depicted in the above figure):

Extract and classify terms: The text documents are preprocessed into entities, ranked by frequencies, and further classified into different categories: people, places, organizations, and miscellaneous.
Construct networks of collocated terms: This step constructs the relations of terms/phrases based on their co-occurences in the same political blogs. At each time point, we obtain a network snapshot of important terms.
Refine the network snapshots: In the above example, we filter vertices with frequency more than 6. Louvain method is applied on the refined networks automatically to detect political events (detected communities highlighted in the gray background in the last panel).
Generate visualization and interactions: Each sub-network (at each time point) is represented as a thumbnail which summarizes its network structure. Consecutive thumbnails can be expanded on mouse over. Modularity histograms, text clouds, arcs diagrams, and small multiples provide supplementary views of these sub-networks.

Here are some important political events (term communities) detected in the above example:

Box A: The Sandy Hook Elementary School shooting occurred on December 14, 2012, in Newtown, Connecticut, when 20-year-old Adam Lanza fatally shot 20 children. Sometime before 9:30 am, Lanza shot and killed his mother Nancy Lanza at their Newtown home with a .22-caliber Savage rifle. Adam Lanza then drove to Sandy Hook Elementary School in his mother's car. The National Rifle Association denied that Adam Lanza or Nancy Lanza were members. On December 21, 2012, the National Rifle Association 's Wayne LaPierre said gun-free school zones attract killers and that another gun ban would not protect Americans.
Box B: The Boston Marathon bombing occurred on April 15, 2013. Two homemade bombs detonated near the finish line of the annual Boston Marathon, killing three people and injuring several hundred others. FBI later identified the bombers: Dzhokhar Tsarnaev and Tamerlan Tsarnaev. The Kyrgyz-American brothers killed a MIT policeman, kidnapped a man in his car, and had a shootout with the police in nearby Watertown. They never lived in Chechnya, yet the brothers identified themselves as Chechen.
Box C: On June 21, 2013, the U.S. Department of Justice unsealed charges against Edward Snowden. The FISA Amendments Act of 2008, an Act of Congress that amended the Foreign Intelligence Surveillance Act, has been used as the legal basis for surveillance programs disclosed by Edward Snowden, including PRISM. Edward Snowden is a former CIA employee who leaked classified information from the National Security Agency (NSA) in 2013 without authorization. NSA Director Keith Alexander initially estimated that Snowden had copied anywhere from 50,000 to 200,000 documents.
On May 20, 2013, Snowden flew to Hong Kong before revealing the most significant leak in U.S. history by Pentagon Papers. Director of National Intelligence James Clapper said Snowden's leaks created a "perfect storm". Edward Snowden also revealed that Verizon had been providing the NSA with its customers' phone records [link].
Box D: The Shooting of Trayvon Martin. On July 13, 2013, the jury found George Zimmerman not guilty of Trayvon Martin's murder: On the night of February 26, 2012, in Sanford, Florida, George Zimmerman fatally shot Trayvon Martin, a 17-year-old African American high school student. The police chief said that there was no evidence to refute Zimmerman 's claim of having acted in self-defense, and that under Florida's Stand Your Ground statute, the police were prohibited by law from making an arrest. The police chief also said that Zimmerman had a right to defend himself with lethal force.

ComModeler: Topic Modeling Using Community Detection

EuroVA 2018, Tommy Dang and Vinh T. Nguyen

Schematic overview of DycomDetector