Document classification is a task in natural language processing and machine learning in which documents are assigned to one or more predefined categories based on their content. It involves training a machine learning model on a labeled dataset in which each document is associated with one or more categories.
The trained model learns patterns and features from the text to predict the appropriate category for unseen documents. Common techniques for document classification include supervised learning algorithms such as Naive Bayes, support vector machines (SVMs), and neural networks.
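As a rough illustration of the train-then-predict workflow described above, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. The class name `NaiveBayesClassifier`, the whitespace tokenizer, the toy training sentences, and the `spam`/`ham` labels are all assumptions made up for this example; a practical system would more likely rely on an established library and a far larger labeled dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Hypothetical minimal multinomial Naive Bayes with add-one smoothing."""

    def fit(self, documents, labels):
        # Count documents per class and word occurrences per class.
        self.class_doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy labeled dataset, invented purely for illustration.
train_docs = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesClassifier().fit(train_docs, train_labels)
prediction_spam = model.predict("claim your free prize")
prediction_ham = model.predict("project meeting on monday")
```

With this toy data, the first unseen message is assigned to `spam` and the second to `ham`, because their words occur more often in the training documents of those classes. Scores are summed in log space so that multiplying many small probabilities does not underflow.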
Applications of document classification include spam email detection, topic categorization of news articles, and document routing in information retrieval systems.