Plagiarism Detection Software, OCR, and Urdu Research: Challenges and Possibilities

پلیجیریزم ڈیٹکشن سافٹ ویئر ، او سی آر اور اردو تحقیق: مسائل اور امکانات

Authors

  • DR. ZAHOOR AHMAD, Secondary School Educator (Urdu), School Education Department, Punjab, Pakistan

DOI:

https://doi.org/10.52015/daryaft.v18i01.441

Keywords:

Urdu Linguistics, OCR, Plagiarism, Urdu Corpus, Ligatures, Khat-e-Nastaliq

Abstract

This article explores the relationship between technology and Urdu literary research, focusing on the challenges posed by plagiarism detection systems and Optical Character Recognition (OCR). Unlike English, whose Latin script is largely free of ligatures, Urdu is written in a cursive script with complex ligatures, which makes OCR development significantly more difficult. At present, the absence of a comprehensive Urdu corpus leaves plagiarism detection with considerable leeway, because a large body of classical and handwritten (calligraphic) Urdu material is not yet available in editable digital formats. The study classifies the PDF formats in which Urdu texts currently exist and evaluates the limitations of current OCR tools, including vFlat, Dastaan, and the OCR developed by the Center for Language Engineering (CLE), particularly in handling diverse fonts and traditional calligraphy (Khat-e-Nastaliq). The development of a universal Urdu OCR is essential for building a robust Urdu corpus. Although such a corpus would increase scrutiny through plagiarism detection software, it would ultimately raise academic standards in Urdu research by encouraging originality, critical engagement, and reduced reliance on unverified textual reproduction.

Conflict of Interest: The author declares that there are no conflicts of interest related to the research, authorship, and/or publication of this article, and that the data presented have not been fabricated or falsified.

Funding: This research did not receive any specific grant or financial support from public, commercial, or not-for-profit funding agencies.

Participant Consent: The author confirms that informed consent was obtained from all participants and that confidentiality was duly maintained.

Author Biography

DR. ZAHOOR AHMAD, Secondary School Educator (Urdu), School Education Department, Punjab, Pakistan

Dr. Zahoor Ahmad is a Secondary School Educator (Urdu) at the School Education Department, Punjab, Pakistan. He earned his PhD from Qurtuba University of Science and Information Technology Peshawar. His academic specialization is Urdu Linguistics, and he has one published research article to his credit.

Published

30-06-2026

How to Cite

DR. ZAHOOR AHMAD. (2026). Plagiarism Detection Software, OCR, and Urdu Research: Challenges and Possibilities: پلیجیریزم ڈیٹکشن سافٹ ویئر ، او سی آر اور اردو تحقیق: مسائل اور امکانات. DARYAFT, 18(01), 114–124. https://doi.org/10.52015/daryaft.v18i01.441