On October 26, 2013, I was invited to speak at the Midwestern Conference of Asian Affairs on digital humanities and the work being done by MATRIX and WIDE. I described our methods of using graph and social network analytical metrics to extract rhetorical moves from discourse. I even got a chance to test our theories on a dataset that I used in my dissertation involving around 36,000 YouTube comments on a video by MriRian.
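To give a flavor of what a social network metric can surface in comment discourse, here is a minimal sketch. It is not our actual pipeline; the commenter names and reply data are hypothetical, and a real analysis would use far richer metrics than in-degree.

```python
from collections import Counter

# Toy reply graph: each pair (replier, target) means the first
# commenter replied to the second in the comment thread.
replies = [
    ("user1", "user2"),
    ("user3", "user2"),
    ("user4", "user2"),
    ("user2", "user5"),
]

# In-degree (number of replies received) is one of the simplest
# network metrics: commenters who attract many replies are rough
# candidates for rhetorically salient moves in the thread.
in_degree = Counter(target for _, target in replies)
most_replied = in_degree.most_common(1)[0][0]
print(most_replied)  # user2 receives the most replies
```

In practice one would build a full graph and compute centrality measures (betweenness, PageRank, and so on), but even this crude count illustrates how structural position in the discourse becomes a signal for interpretation.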
I hope to discuss the findings of this experiment in more detail in a future post. Here, I want to express my gratitude to Professor Ethan Segal and MATRIX for allowing me to present on our work in WIDE-MATRIX’s Computational Rhetoric group, and to thank the assembled audience of Asian Studies scholars. The group asked keen questions about the unruly, noisy nature of texts, which always reminds me of the rhetorical acts implicit in the normalization of data. The act of cleaning data is always a lossy one; those of us who do text or data mining must not only be cognizant of the data we are screening out but also be as transparent as possible about our biases and rationale for these protocols. Methods carry with them methodologies, and these methodologies are inflected by ideological orientations that condition the data.
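The lossiness of cleaning is easy to demonstrate concretely. The normalization steps below are a hypothetical sketch, not our protocol, but each one discards signal along with noise:

```python
import re

def normalize(comment):
    # Each step here is a rhetorical decision about what counts as noise.
    text = comment.lower()                     # case lost: "SOOOOO" was emphasis
    text = re.sub(r"[^\w\s]", "", text)        # punctuation lost: "?!?!" carried affect
    text = re.sub(r"(.)\1{2,}", r"\1", text)   # repeated letters lost: intensity erased
    return text

print(normalize("SOOOOO good?!?!"))  # -> "so good"
```

The cleaned string is easier to mine, but the shouting and the exclamatory affect, which are exactly the kinds of rhetorical moves one might want to study, are gone. Being transparent about such steps is part of the method.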
The term heuristic also came up often in our discussions, and to me that is one of the most valuable and focusing glosses for the work we are doing in the Computational Rhetoric group. We are creating and applying critical lenses to data in order to build theory. But these lenses still require human tuning to arrive at plausible interpretations.