Invoice Classification Using Deep Features and Machine Learning Techniques

Ahmad S. Tarawneh, Ahmad B. Hassanat, D. Chetverikov, Imre Lendak, Chaman Verma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Invoices are issued by companies, banks and other organizations in different forms, including handwritten and machine-printed ones; sometimes, receipts are included as a separate form of invoice. In current practice, these types are normally classified manually, since each needs its own kind of processing, such as making it suitable for optical character recognition (OCR) systems. Classifying invoices manually into different categories is a hard and time-consuming task. Therefore, we propose an automatic approach to classify invoices into three types: handwritten, machine-printed and receipts. The proposed method is based on extracting features using the deep convolutional neural network AlexNet. The features are classified using various machine learning algorithms, including Random Forests, K-nearest neighbors (KNN), and Naive Bayes. Different cross-validation approaches are applied in the experiments to verify the effectiveness of the proposed solution. The best classification result was 98.4% (total accuracy), which was achieved by KNN. Such almost perfect performance allows the proposed method to be used in practice as a preprocessing step for OCR systems, or as a standalone application.
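
The classification stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the deep features have already been extracted and replaces them with synthetic vectors, and it covers only KNN, not Random Forests or Naive Bayes. The feature dimension (16), the value k=3, the five folds, and all function names are illustrative assumptions; the abstract does not state the AlexNet layer, the value of k, or the cross-validation settings.

```python
import random
from collections import Counter

# The three invoice types named in the abstract.
CLASSES = ["handwritten", "machine-printed", "receipt"]


def make_synthetic_features(n_per_class=30, dim=16, seed=0):
    """Hypothetical stand-in for deep features: one Gaussian cluster per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label, _name in enumerate(CLASSES):
        center = [rng.uniform(-5, 5) for _ in range(dim)]
        for _ in range(n_per_class):
            X.append([c + rng.gauss(0, 1) for c in center])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`
    (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, n_folds=5, k=3, seed=0):
    """Shuffle once, split into n_folds, and return the overall accuracy
    over all held-out samples."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train_X = [X[i] for i in idx if i not in held]
        train_y = [y[i] for i in idx if i not in held]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_synthetic_features()
    acc = cross_validate(X, y)
    print(f"5-fold KNN accuracy on synthetic features: {acc:.3f}")
```

In a real pipeline, make_synthetic_features would be replaced by feature vectors taken from a pretrained AlexNet, with one vector per invoice image. The accuracy printed here says nothing about the 98.4% reported in the abstract, since the data are synthetic.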

Original language: English
Title of host publication: 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 855-859
Number of pages: 5
ISBN (Electronic): 9781538679425
DOIs: https://doi.org/10.1109/JEEIT.2019.8717504
Publication status: Published - May 16 2019
Event: 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Amman, Jordan
Duration: Apr 9 2019 to Apr 11 2019

Publication series

Name: 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings

Conference

Conference: 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019
Country: Jordan
City: Amman
Period: 4/9/19 to 4/11/19

Fingerprint

Optical character recognition
Learning systems
Machine Learning
Character Recognition
Nearest Neighbor
Learning algorithms
Naive Bayes
Random Forest
Neural networks
Cross-validation
Learning Algorithm
Processing
Classify
Neural Networks
Industry
Experiments
Experiment
Machine learning
Form
K-nearest neighbor

Keywords

  • Deep features
  • Invoice classification
  • Machine learning
  • OCR

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Information Systems
  • Information Systems and Management
  • Electrical and Electronic Engineering
  • Computational Mathematics

Cite this

Tarawneh, A. S., Hassanat, A. B., Chetverikov, D., Lendak, I., & Verma, C. (2019). Invoice Classification Using Deep Features and Machine Learning Techniques. In 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings (pp. 855-859). [8717504] (2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/JEEIT.2019.8717504

Invoice Classification Using Deep Features and Machine Learning Techniques. / Tarawneh, Ahmad S.; Hassanat, Ahmad B.; Chetverikov, D.; Lendak, Imre; Verma, Chaman.

2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 855-859 8717504 (2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings).

Tarawneh, AS, Hassanat, AB, Chetverikov, D, Lendak, I & Verma, C 2019, Invoice Classification Using Deep Features and Machine Learning Techniques. in 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings., 8717504, 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 855-859, 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019, Amman, Jordan, 4/9/19. https://doi.org/10.1109/JEEIT.2019.8717504
Tarawneh AS, Hassanat AB, Chetverikov D, Lendak I, Verma C. Invoice Classification Using Deep Features and Machine Learning Techniques. In 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 855-859. 8717504. (2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings). https://doi.org/10.1109/JEEIT.2019.8717504
Tarawneh, Ahmad S. ; Hassanat, Ahmad B. ; Chetverikov, D. ; Lendak, Imre ; Verma, Chaman. / Invoice Classification Using Deep Features and Machine Learning Techniques. 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 855-859 (2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings).
@inproceedings{ef74675651cc4cd4bba69dc4fd1151c6,
title = "Invoice Classification Using Deep Features and Machine Learning Techniques",
abstract = "Invoices are issued by companies, banks and other organizations in different forms, including handwritten and machine-printed ones; sometimes, receipts are included as a separate form of invoice. In current practice, these types are normally classified manually, since each needs its own kind of processing, such as making it suitable for optical character recognition (OCR) systems. Classifying invoices manually into different categories is a hard and time-consuming task. Therefore, we propose an automatic approach to classify invoices into three types: handwritten, machine-printed and receipts. The proposed method is based on extracting features using the deep convolutional neural network AlexNet. The features are classified using various machine learning algorithms, including Random Forests, K-nearest neighbors (KNN), and Naive Bayes. Different cross-validation approaches are applied in the experiments to verify the effectiveness of the proposed solution. The best classification result was 98.4{\%} (total accuracy), which was achieved by KNN. Such almost perfect performance allows the proposed method to be used in practice as a preprocessing step for OCR systems, or as a standalone application.",
keywords = "Deep features, Invoice classification, Machine learning, OCR",
author = "Tarawneh, {Ahmad S.} and Hassanat, {Ahmad B.} and D. Chetverikov and Imre Lendak and Chaman Verma",
year = "2019",
month = "5",
day = "16",
doi = "10.1109/JEEIT.2019.8717504",
language = "English",
series = "2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "855--859",
booktitle = "2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings",

}

TY - GEN

T1 - Invoice Classification Using Deep Features and Machine Learning Techniques

AU - Tarawneh, Ahmad S.

AU - Hassanat, Ahmad B.

AU - Chetverikov, D.

AU - Lendak, Imre

AU - Verma, Chaman

PY - 2019/5/16

Y1 - 2019/5/16

AB - Invoices are issued by companies, banks and other organizations in different forms, including handwritten and machine-printed ones; sometimes, receipts are included as a separate form of invoice. In current practice, these types are normally classified manually, since each needs its own kind of processing, such as making it suitable for optical character recognition (OCR) systems. Classifying invoices manually into different categories is a hard and time-consuming task. Therefore, we propose an automatic approach to classify invoices into three types: handwritten, machine-printed and receipts. The proposed method is based on extracting features using the deep convolutional neural network AlexNet. The features are classified using various machine learning algorithms, including Random Forests, K-nearest neighbors (KNN), and Naive Bayes. Different cross-validation approaches are applied in the experiments to verify the effectiveness of the proposed solution. The best classification result was 98.4% (total accuracy), which was achieved by KNN. Such almost perfect performance allows the proposed method to be used in practice as a preprocessing step for OCR systems, or as a standalone application.

KW - Deep features

KW - Invoice classification

KW - Machine learning

KW - OCR

UR - http://www.scopus.com/inward/record.url?scp=85067106417&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067106417&partnerID=8YFLogxK

U2 - 10.1109/JEEIT.2019.8717504

DO - 10.1109/JEEIT.2019.8717504

M3 - Conference contribution

AN - SCOPUS:85067106417

T3 - 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings

SP - 855

EP - 859

BT - 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology, JEEIT 2019 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -