Hello @lucaskjaero,
I have a project similar to yours in which I've implemented some Chinese character recognition models using the CASIA datasets. I've similarly used the CASIA competition GNT files, but I believe it should be easier to build performant models on the HWDB1.X and OLHWDB1.X datasets because they are five times larger. Unfortunately, those datasets use a different file format, MPF. Do you have any idea how to process these files using Python?
Datasets:
http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
My Project:
https://github.com/brucegarro/chinese-character-recognition