Replication Data for: Total-Text: Towards Orientation Robustness in Scene Text Detection

Version 1.0

Chan, Chee Seng, 2023, "Replication Data for: Total-Text: Towards Orientation Robustness in Scene Text Detection", https://doi.org/10.22452/RD/PDABNZ, Universiti Malaya Research Data Repository, V1

Dataset Metrics

666 Downloads

Description	The Total-Text dataset is a collection of 1,555 images containing text in more than three orientations, including horizontal, multi-oriented, and curved, a one-of-a-kind combination. The images are split across two compressed archives: a) Train, which contains 1,255 images; b) Test, which contains 300 images.
Subject	Computer and Information Science
Keyword	Curved text, Scene text detection
Related Publication	Ch’ng, CK., Chan, C.S. & Liu, CL. Total-Text: toward orientation robustness in scene text detection. IJDAR 23, 31–52 (2020). doi: 10.1007/s10032-019-00334-z
Notes	Total-Text is a word-level English curved-text dataset. If you are interested in a text-line-level dataset with both English and Chinese instances, we highly recommend SCUT-CTW1500 (https://github.com/Yuliang-Liu/Curve-Text-Detector). In addition, the Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT, http://rrc.cvc.uab.es/?ch=14), which extends Total-Text and SCUT-CTW1500, was held at ICDAR 2019 to stimulate more innovative ideas on the arbitrary-shaped text reading task. Congratulations to all winners and challengers. The technical report on ArT can be found at https://arxiv.org/abs/1909.07145. Total-Text and SCUT-CTW1500 are now part of the training set of the largest curved-text dataset, ArT (Arbitrary-Shaped Text dataset). To keep future benchmarking on Total-Text valid, anyone who uses the extra training data from ArT should first remove the Total-Text test-set images from it. We trust the research community to perform this removal so that benchmarking remains fair.
License/Data Use Agreement	CC0 1.0
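The Notes field asks that Total-Text test-set images be removed from ArT before ArT is used as extra training data. A minimal sketch of that filtering step follows. It assumes you already hold a list of ArT image identifiers and a list of the identifiers that correspond to Total-Text test images; the page does not say how the two datasets' identifiers map to each other, so both inputs and the function name are hypothetical.

```python
def remove_total_text_test_images(art_image_ids, total_text_test_ids):
    """Return the ArT IDs with Total-Text test images excluded, order preserved.

    Both arguments are hypothetical inputs: the caller must supply the IDs of
    the ArT images that overlap with the Total-Text test set.
    """
    excluded = set(total_text_test_ids)
    return [image_id for image_id in art_image_ids if image_id not in excluded]


# Illustrative identifiers only; they are not real ArT file names.
art_ids = ["art_0001", "art_0002", "art_0003", "art_0004"]
overlap = ["art_0002", "art_0004"]
filtered_ids = remove_total_text_test_images(art_ids, overlap)
```

Using a set for the excluded identifiers keeps the lookup cheap even when the training list is large.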

Files (6)

File	Type	Size	Published	Downloads	MD5	Description
collage.png	PNG Image	7.9 MB	May 15, 2023	190	f93c7d17568b1673352f75adf5600ea1	
groundtruth_pixel_clm.tar.gz	Gzip Archive	76.0 MB	May 15, 2023	8	f2d6db193be9033bf54224ea44a72cc0	Pixel-level mask ground truth of the Total-Text dataset
groundtruth_pixel_trm.tar.gz	Gzip Archive	5.4 MB	May 15, 2023	218	0d0ce35e77f580438683dac4a0df10c7	Text region mask ground truth of the Total-Text dataset
groundtruth_text.tar.gz	Gzip Archive	2.5 MB	May 15, 2023	228	4b26c65f89bc8b54f791be2f4caf11f3	Ground truth of the Total-Text dataset
test.tar.gz	Gzip Archive	100.9 MB	May 15, 2023	13	1063e6ae79c990cf6fcd255233bf938b	Archive of 300 test images
train.tar.gz	Gzip Archive	311.5 MB	May 15, 2023	9	5fd328598da9bc21a2c2efef4890dfd4	Archive of 1255 training images
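Each file in the listing above is published with an MD5 digest, so a download can be checked before it is unpacked. The sketch below shows one way to do that with Python's standard library. The digests are copied from the listing; the function names and the assumption that all files sit in one local directory under their original names are illustrative choices, not part of the repository.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this page.
EXPECTED_MD5 = {
    "collage.png": "f93c7d17568b1673352f75adf5600ea1",
    "groundtruth_pixel_clm.tar.gz": "f2d6db193be9033bf54224ea44a72cc0",
    "groundtruth_pixel_trm.tar.gz": "0d0ce35e77f580438683dac4a0df10c7",
    "groundtruth_text.tar.gz": "4b26c65f89bc8b54f791be2f4caf11f3",
    "test.tar.gz": "1063e6ae79c990cf6fcd255233bf938b",
    "train.tar.gz": "5fd328598da9bc21a2c2efef4890dfd4",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_downloads("path/to/downloads")` returns a status per file; reading in chunks avoids loading the 311.5 MB training archive into memory at once.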

Citation Metadata

Persistent Identifier	doi:10.22452/RD/PDABNZ
Publication Date	2023-05-15
Title	Replication Data for: Total-Text: Towards Orientation Robustness in Scene Text Detection
Author	Chan, Chee Seng (Universiti Malaya) - ORCID: 0000-0001-7677-2865
Point of Contact	Chan, Chee Seng (Universiti Malaya)
Depositor	Liew, Chee Sun
Deposit Date	2023-05-14
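The persistent identifier above can also be used to look the dataset up programmatically. The sketch below builds a metadata URL in the form used by the Dataverse native API and pulls file names out of a response. Two things are assumptions: that the repository is reachable at `researchdata.um.edu.my` and runs a standard Dataverse installation, and that the response follows the usual `data.latestVersion.files[].dataFile.filename` layout. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: repository host; it should serve a standard Dataverse API.
BASE_URL = "https://researchdata.um.edu.my"
# Taken from the Persistent Identifier field in the citation metadata.
PERSISTENT_ID = "doi:10.22452/RD/PDABNZ"


def dataset_metadata_url(base_url=BASE_URL, persistent_id=PERSISTENT_ID):
    """Build the dataset metadata URL for a persistent identifier."""
    query = urlencode({"persistentId": persistent_id})
    return f"{base_url}/api/datasets/:persistentId/?{query}"


def file_names(dataset_json):
    """List file names from a parsed dataset response (assumed layout)."""
    files = dataset_json["data"]["latestVersion"]["files"]
    return [entry["dataFile"]["filename"] for entry in files]
```

Fetching the URL (for example with `urllib.request.urlopen`) and parsing the body with `json.load` is left out so the sketch runs without network access.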

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Creative Commons CC0 1.0 Universal Public Domain Dedication (CC0 1.0)
