Huidan Liu 2019-02-19T15:02:30+00:00

刘汇丹(Huidan Liu)

Phone:   +86-10-82661800-1219

Fax:     +86-10-82661800

Email:    huidan@iscas.ac.cn

Links:   /team/liuhuidan/

BIOGRAPHY

I have been an Associate Professor of Computer Science in the IR Laboratory at the Institute of Software, Chinese Academy of Sciences, since March 2015.

I received my PhD in Computer Software and Theory from the Institute of Software, Chinese Academy of Sciences, in January 2013, under the supervision of Professors Jian Wu and Yeping He. My PhD thesis deals with Tibetan natural language processing.

My long-term research covers Chinese and multilingual information processing, including character encoding detection and conversion, input methods, Tibetan word segmentation and part-of-speech tagging, text mining, and Sino-Tibetan machine translation. I have filed 10 patent applications with partners, 6 of which have been granted.

RESEARCH INTERESTS

  • Chinese information processing.
  • Multilingual information processing.
  • Machine translation.
  • Information extraction.

SELECTED PUBLICATIONS

Monographs

  1. 龙从军,刘汇丹.藏文自动分词的理论与方法研究[M].北京:知识产权出版社. 2016年3月.

Academic Papers

2016-2018                                                               

  1. 龙从军,豆格才让,刘汇丹.汉—藏人名用字音译规则研究[J].中文信息学报,2018,32(03):71-76.
  2. 刘汇丹,洪锦玲,诺明花,吴健.基于大规模网络语料的藏文音节拼写错误统计与分析[J].中文信息学报,2017,31(02):61-70.
  3. 李博涵,刘汇丹,龙从军,吴健.基于深度学习的藏文分词方法[J].计算机工程与设计,2018,39(01):194-198.
  4. 龙从军,刘汇丹,吴健.藏语音节标注研究[J].中文信息学报,2017,31(04):89-93+99.
  5. Huidan Liu, Weina Zhao, Minghua Nuo, Jinling Hong, Xin Yu, Jian Wu. A Chinese to Tibetan Machine Translation System with Multiple Translating Strategies[J]. Himalayan Linguistics,2016,15(1):149-166.
  6. Weina Zhao, Lin Li, Huidan Liu, Jian Wu. Tibetan Trisyllabic Light Verb Construction Recognition[J]. Himalayan Linguistics,2016,15(1):137-148.
  7. 李博涵, 刘汇丹, 龙从军,吴健. 深度学习在汉藏机器翻译中的应用研究[C]. 第12届全国机器翻译研讨会论文集,2016:54-60.
  8. 龙从军, 刘汇丹, 安波, 才华,吴健. 藏文编码字符集标准应用中的问题及对策[J]. 信息技术与标准化,2016,(1-2):46-51.

2015                                                                   

  1. Huidan Liu, Congjun Long, Minghua Nuo, Jian Wu. Tibetan Word Segmentation as Sub-syllable Tagging with Syllable’s Part-of-Speech Property. LNAI 9427. 2015. Springer. (EI)
  2. Minghua Nuo, Huidan Liu, Congjun Long, Jian Wu. Tibetan Unknown Word Identification from News Corpora for Supporting Lexicon-based Tibetan Word Segmentation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Paper): 451-457, Beijing, China, 2015. (EI)
  3. Huidan Liu, Minghua Nuo, Jian Wu. Zipf’s Law and Statistical Data on Modern Tibetan. The 25th International Conference on Computational Linguistics (COLING 2014):322-333. (EI)
  4. 刘汇丹, 诺明花, 马龙龙, 吴健, 贺也平.Web 藏文文本资源挖掘与利用研究[J]. 中文信息学报,2015,29(1):170-177.
  5. 赵维纳,李琳,刘汇丹等.藏语三音动词短语自动抽取研究[J].中文信息学报,2015,29(3):196-200.DOI:10.3969/j.issn.1003-0077.2015.03.027.
  6. 龙从军,刘汇丹,诺明花等.基于藏语字性标注的词性预测研究[J].中文信息学报,2015,29(5):211-215.DOI:10.3969/j.issn.1003-0077.2015.05.027.
  7. 安波, 诺明花, 吴健, 刘汇丹, 马龙龙. 传统蒙古文“同形不同码”问题研究[J]. 信息技术与标准化,2015,(1-2):62-66.

2013-2014                                                             

  1. 刘汇丹, 诺明花, 吴健.基于大规模网络语料的藏文音节拼写错误统计与分析[C].//第十三届全国计算语言学学术会议.2014.
  2. 王震,刘汇丹,吴健等.新标准体系下蒙古文变形显现模型的设计与实现[J].中文信息学报,2013,27(1):108-114.DOI:10.3969/j.issn.1003-0077.2013.01.015.
  3. 赵维纳,于新,刘汇丹等.现代藏语助动词结尾句子边界识别方法[J].中文信息学报,2013,27(1):115-119.DOI:10.3969/j.issn.1003-0077.2013.01.016.
  4. 熊维,吴健,刘汇丹等.基于短语串实例的汉藏辅助翻译[J].中文信息学报,2013,27(3):84-90.DOI:10.3969/j.issn.1003-0077.2013.03.011.
  5. 诺明花,刘汇丹,马龙龙等.基于中心语块扩展的汉藏基本名词短语对的识别[J].中文信息学报,2013,27(4):63-69.DOI:10.3969/j.issn.1003-0077.2013.04.010.

2012                                                                  

  1. Huidan Liu, Minghua Nuo, Jian Wu and Yeping He. Building Large Scale Text Corpus for Tibetan Natural Language Processing by Extracting Text from Web Pages. The 10th Workshop on Asian Language Resources at COLING 2012:11-20.
  2. Minghua Nuo, Huidan Liu, Weina Zhao, Longlong Ma, Jian Wu, Zhiming Ding. Tibetan base Noun Phrase Identification framework based on Chinese-Tibetan sentence aligned corpus. In Proceedings of the 24th International Conference on Computational Linguistics (COLING2012):2141-2158. (EI)
  3. 刘汇丹, 诺明花, 赵维纳, 吴健, 贺也平. SegT:一个实用的藏文分词系统[J]. 中文信息学报,2012,26(1):97-103.
  4. 刘汇丹, 诺明花, 马龙龙, 吴健, 贺也平. 通用藏文搜索引擎关键技术研究[C].//第四届全国少数民族青年自然语言信息处理论文集.2012:162-166.
  5. 吴健, 刘汇丹. 基于词语消歧的分层次汉字简繁转换系统[J]. 中国语言战略,2012,1(1):25-35.
  6. 诺明花, 刘汇丹, 吴健, 丁治明. 基于关联度的汉藏多词单元等价对抽取方法[J]. 中文信息学报.2012,26(3):98-103.
  7. 诺明花, 刘汇丹, 马龙龙, 吴健, 丁治明.基于中心语块扩展的汉藏基本名词短语对识别. 第六届全国计算语言学研讨会(YCCL),2012:194-200.
  8. 于新, 张立强, 刘汇丹等. 大规模汉藏双语语料库构建机制[C].//第四届全国少数民族青年自然语言信息处理论文集.2012:37-42.

2011                                                                  

  1. Huidan Liu, Minghua Nuo, Longlong Ma, Jian Wu and Yeping He. Tibetan Word Segmentation as Syllable Tagging Using Conditional Random Fields. In Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation (PACLIC-2011):168-177. (EI)
  2. Huidan Liu, Minghua Nuo, Longlong Ma, Jian Wu and Yeping He. Compression Methods by Code Mapping and Code Dividing for Chinese Dictionary Stored in a Double-Array Trie. Proceedings of the Fifth International Joint Conference on Natural Language Processing (IJCNLP-2011):1189-1197.
  3. Minghua Nuo, Huidan Liu, Longlong Ma, Jian Wu, Zhiming Ding. Automatic Acquisition of Chinese-Tibetan Multi-Word Equivalent Pair from Bilingual Corpora. In the Proceedings of the International Conference on Asian Language Processing (IALP2011):177-180. (EI)
  4. Long-long Ma, Huidan Liu, Jian Wu. MRG-OHTC Database for Online Handwritten Tibetan Character Recognition. ICDAR 2011: 207-211. (EI)
  5. 江荻, 刘汇丹, 吴兵等. 国际音标输入软件的设计与实现[J]. 中文信息学报,2011, 25(2): 111-116.
  6. 诺明花, 刘汇丹, 马龙龙, 吴健. 汉藏短语抽取算法研究. 第十三届少数民族语言信息处理学术研讨会.2011:50-57.
  7. 诺明花, 吴健, 刘汇丹等. 汉藏短语对抽取中短语译文获取方法研究[J]. 中文信息学报,2011,25(3):112-117.
  8. 诺明花, 张立强, 刘汇丹等. 汉藏短语抽取[J]. 中文信息学报,2011,25(2):105-110,121.
  9. 熊维, 王震, 于新, 刘汇丹等.ISCAS 机器翻译系统与评测技术报告[C].//第七届全国机器翻译研讨会论文集.2011:155-161.

2006-2010                                                               

  1. Huidan LIU, Weina ZHAO, Minghua NUO, Li JIANG, Jian WU, Yeping HE. Tibetan Number Identification Based on Classification of Number Components in Tibetan Word Segmentation. In Proceedings of the 23rd International Conference on Computational Linguistics – poster volume (COLING 2010):719-724. (EI)
  2. Weina Zhao, Xin Yu, Huidan Liu. Sentence Boundary Detection Based on Auxiliary Verbs in Modern Tibetan. Conference on Language Investigation and Information Processing (LIIP2010).
  3. 刘汇丹, 诺明花, 赵维纳, 吴健, 贺也平. 藏文编码转换软件“藏码通”的设计与实现[C].//第三届全国少数民族青年自然语言信息处理、第二届全国多语言知识库建设联合学术研讨会论文集.2010:217-221.
  4. 刘汇丹, 芮建武, 吴健. 基于Qt的国际化图形用户界面设计与实现[J]. 中文信息学报,2006,20(4):94-99.
  5. 刘汇丹, 芮建武, 吴健.藏文网页的编码识别与转换[C].//中文信息处理前沿进展:中国中文信息学会二十五周年学术会议论文集.2006:573-580.
  6. 赵维纳, 刘汇丹, 于新等. 基于法律文本的藏语句子边界识别[C].//第五届全国青年计算语言学研讨会(YWCL 2010) 论文集.2010:480-486.
  7. 赵维纳, 刘汇丹, 于新等. 面向汉藏辅助翻译系统的平行语料库建设[C].//第三届全国少数民族青年自然语言信息处理、第二届全国多语言知识库建设联合学术研讨会论文集.2010:43-46.
  8. 诺明花, 张立强, 刘汇丹等. 汉藏短语抽取[C].//第五届全国青年计算语言学研讨会(YWCL 2010) 论文集.2010:303-309.