Universidad Complutense de Madrid
E-Prints Complutense

A Generalized Email Classification System for Workflow Analysis

Chaipornkaew, Piyanuch and Prexawanprasut, Takorn and Chang, Chia-Lin and McAleer, Michael (2017) A Generalized Email Classification System for Workflow Analysis. [ Documentos de trabajo del Instituto Complutense de Análisis Económico (ICAE); nº 21, 2017, ISSN: 2341-2356 ]

Creative Commons Attribution Non-commercial Share Alike.

URL: https://www.ucm.es/icae
URL Type: Organisation


Abstract

Email is one of the most powerful internet communication channels. Because employees and their clients communicate primarily by email, much crucial business data is conveyed in email content. Businesses that are understandably concerned about this need a sophisticated workflow management system to manage their transactions. Such a system should also be able to classify incoming emails into suitable categories. Previous research implemented a system that categorizes emails based on the words found in email messages. Two parameters affected the accuracy of that program: the number of words in a database that are compared with sample emails, and an acceptable percentage for classifying emails. As email has grown in volume and sophistication, this research classifies email messages into a larger number of categories and changes one of the parameters that affect the accuracy of the program. The first parameter, the number of words in a database compared with sample emails, remains unchanged, while the second parameter is changed from an acceptable percentage to the number of matching words. The empirical results suggest setting the number of words in a database compared with sample emails to 11, and the number of matching words required to categorize an email to 7. When these settings are applied to categorize 12,465 emails, the accuracy is approximately 65.3%. The optimal number of words that yields high accuracy levels lies between 11 and 13, while the optimal number of matching words lies between 6 and 8.
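
As a rough illustration of the word-matching idea summarized in the abstract, the Python sketch below counts how many words from a per-category word list appear in an email and assigns the category only when that count reaches a threshold. The abstract does not say how words are selected, how text is tokenized, or how ties between categories are resolved, so the `tokenize` and `classify` functions, the example word lists, and the "highest count wins" rule are all assumptions made for illustration, not the authors' implementation. Only the two settings (11 words per list, 7 matching words) come from the abstract.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of distinct alphabetic words.

    Assumption: the source does not describe its tokenization.
    """
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(email_text, category_words, min_matches=7):
    """Return the category whose word list shares the most words with the
    email, or None when no category reaches min_matches.

    Assumption: the first category with the highest count wins on ties.
    """
    words = tokenize(email_text)
    best_category, best_count = None, 0
    for category, keywords in category_words.items():
        count = len(words & set(keywords))
        if count >= min_matches and count > best_count:
            best_category, best_count = category, count
    return best_category


# Hypothetical word lists of 11 words each (the first parameter).
CATEGORY_WORDS = {
    "order": ["order", "purchase", "quantity", "price", "invoice", "delivery",
              "shipment", "product", "payment", "quote", "confirm"],
    "support": ["problem", "error", "issue", "help", "broken", "fix",
                "support", "fail", "urgent", "crash", "restart"],
}

email = ("Please confirm the purchase order: quantity 40, price as in your "
         "quote. Send the invoice and the delivery date.")
print(classify(email, CATEGORY_WORDS))
```

With the hypothetical lists above, the sample email shares eight words with the "order" list and none with the "support" list, so it clears the threshold of 7 for "order" only. Lowering `min_matches` makes the classifier more permissive; raising it leaves more emails uncategorized.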


Item Type: Working Paper or Technical Report
Uncontrolled Keywords: Email; business data; workflow management system; business transactions.
Subjects: Sciences > Mathematics > Operations research; Social sciences > Economics > Economic development; Social sciences > Economics > Labor
JEL: J24, O31, O32, O33
Series Name: Documentos de trabajo del Instituto Complutense de Análisis Económico (ICAE)
Volume: 2017
Number: 21
ID Code: 44630
Deposited On: 18 Sep 2017 11:16
Last Modified: 18 Sep 2017 11:16
