An Improved mT5 Model for Chinese Text Summary Generation


Fuping Ren², Jian Chen¹, and Defu Zhang¹; ¹Xiamen University, China; ²Shenzhen Comtech Technology Co. Ltd, China


Complex policy documents are difficult to understand, which motivates intelligent interpretation of Chinese policies. To improve Chinese text summarization, this study uses the mT5 model as the core framework and as the source of initial weights. It reduces model size through parameter clipping, adopts Gap Sentence Generation (GSG) as the unsupervised training method, and improves the Chinese tokenizer. Training on a carefully processed 30 GB Chinese corpus yields the enhanced mT5-GSG model. During fine-tuning on Chinese policy texts, the study adopts the "Dropout Twice" approach and merges the probability distributions of the two dropout passes using the Wasserstein distance. Experimental results show that the proposed model achieves ROUGE-1, ROUGE-2, and ROUGE-L scores of 56.13%, 45.76%, and 56.41%, respectively, on the Chinese policy text summarization dataset.
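
To make the GSG idea concrete, the following is a minimal sketch of how one gap-sentence training pair might be built from raw Chinese text. It follows the general recipe associated with GSG (mask the sentences that overlap most with the rest of the document and use them as the target) and is not taken from the paper. The character-level ROUGE-1 scoring, the 0.3 mask ratio, the `<mask_1>` placeholder, and all function names are illustrative assumptions.

```python
import re
from collections import Counter

MASK = "<mask_1>"  # hypothetical placeholder token, not the paper's vocabulary


def split_sentences(text):
    # Split after Chinese sentence-final punctuation, keeping the punctuation.
    parts = re.split(r"(?<=[。！？])", text)
    return [p.strip() for p in parts if p.strip()]


def rouge1_f1(candidate, reference):
    # Character-level unigram overlap (punctuation included), as an F1 score.
    c, r = Counter(candidate), Counter(reference)
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def gsg_example(text, ratio=0.3):
    # Build one (source, target) pair: the highest-scoring sentences are
    # replaced by MASK in the source and concatenated to form the target.
    sentences = split_sentences(text)
    n_mask = max(1, round(len(sentences) * ratio))  # assumed mask ratio
    scores = []
    for i, sent in enumerate(sentences):
        rest = "".join(sentences[:i] + sentences[i + 1:])
        scores.append(rouge1_f1(sent, rest))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:n_mask])  # restore document order
    source = "".join(MASK if i in chosen else s for i, s in enumerate(sentences))
    target = "".join(sentences[i] for i in chosen)
    return source, target
```

Calling `gsg_example` on a short passage returns the masked source and the gap-sentence target; a sentence that shares little vocabulary with the rest of the passage is left in place, while a sentence that repeats the passage's main terms is the one masked.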


Natural Language Processing, Text Summarization, Transformer model

Full Text  Volume 14, Number 2