Publication:
A Generalized Email Classification System for Workflow Analysis

Publication Date
2017
Publisher
Facultad de CC Económicas y Empresariales. Instituto Complutense de Análisis Económico (ICAE)
Abstract
Email is one of the most powerful internet communication channels. Because employees and their clients communicate primarily via email, much crucial business data is conveyed in email content. Businesses are understandably concerned with this data, and they need a sophisticated workflow management system to manage their transactions. Such a system should also be able to classify incoming emails into suitable categories. Previous research implemented a system that categorizes emails based on the words found in the messages. Two parameters affected the accuracy of that program: the number of words in a database that are compared with sample emails, and an acceptable percentage for classifying emails. As email volume has grown and email content has become more sophisticated, this research classifies email messages into a larger number of categories and changes one of the parameters that affects accuracy. The first parameter, the number of words in the database compared with sample emails, is retained, while the second is changed from an acceptable percentage to the number of matching words. The empirical results suggest that the most suitable number of database words compared with sample emails is 11, and the most suitable number of matching words for categorizing an email is 7. When these settings are applied to categorize 12,465 emails, the accuracy is approximately 65.3%. The number of database words that yields high accuracy lies between 11 and 13, and the number of matching words lies between 6 and 8.
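The two parameters described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the category names, the word lists, the tokenizer, and the tie-breaking rule are all hypothetical choices made for illustration. It assumes each category keeps a fixed list of database words (11 here) and that an email is assigned to a category only when at least a threshold number of those words (7 here) appear in the message.

```python
import re

# Hypothetical word database: 11 words per category (the first parameter).
CATEGORY_WORDS = {
    "billing": ["invoice", "payment", "amount", "due", "balance", "receipt",
                "bank", "transfer", "tax", "price", "refund"],
    "shipping": ["shipment", "delivery", "tracking", "courier", "package",
                 "address", "warehouse", "dispatch", "carrier", "arrival",
                 "customs"],
}


def tokenize(text):
    """Lowercase the text and return its distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(email_text, category_words, min_matches=7):
    """Return the category whose words match the email most often.

    min_matches is the second parameter: the number of matching words
    required before a category is accepted. Returns None when no category
    reaches the threshold. On a tie, the category listed first wins
    (an assumption of this sketch).
    """
    tokens = tokenize(email_text)
    best_category, best_count = None, 0
    for category, words in category_words.items():
        count = len(tokens & set(words))
        if count > best_count:
            best_category, best_count = category, count
    return best_category if best_count >= min_matches else None


email = ("Please find the invoice attached. The payment amount is due Friday; "
         "send the receipt after the bank transfer.")
print(classify(email, CATEGORY_WORDS))
```

Raising `min_matches` makes the classifier stricter and leaves more emails unassigned, while lowering it assigns more emails at the risk of wrong categories. This trade-off is the kind the abstract's reported range of 6 to 8 matching words addresses.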