GENETIC ALGORITHM BASED SENTENCE EXTRACTION FOR AUTOMATIC TEXT SUMMARIZATION

With the development of the Internet, the amount of data available in the digital environment is increasing continuously. Especially with Web 2.0 technology, the growth of sites where users can add new content, such as Wikipedia, blogs and social media, has driven both the number and the size of the information on the Internet to enormous proportions. In an environment with this much data, reaching the desired information is a serious problem. Today's information age makes it necessary to use automatic text summarization systems in many areas related to information extraction in order to reach the sought information more quickly. In this study, text summarization methods based on sentence extraction are examined; first, features that represent the sentences in a document were extracted, and then these

GENETIC ALGORITHM BASED SENTENCE EXTRACTION FOR AUTOMATIC TEXT SUMMARIZATION

With the development of the Internet, the amount of data in the digital environment is continuously increasing. Especially with Web 2.0 technology, the growth of sites to which users can add new content, such as Wikipedia, blogs and social media, has increased the information on the Internet in both number and size. Accessing the required information in an environment with so much data is a serious problem. Today’s information age makes it necessary to use automatic text summarization systems in many areas related to information retrieval in order to access the sought information. In this study, text summarization methods based on sentence extraction are discussed. First, features that represent the sentences in a document are extracted, and then the effectiveness of these features for summarization is assessed using a genetic algorithm. The data set used in the study consists of 120 documents containing Turkish news texts and their summaries. A genetic algorithm is trained on 80 documents to determine the best weight values for the features; the 40 test documents are then summarized with these weights and the results are compared with the original summaries.
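The weight-learning step described above can be illustrated with a minimal sketch. The abstract does not specify the features, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: sentences are given as numeric feature vectors, a sentence score is a weighted sum of its features, fitness is the mean F1 overlap between the selected and the reference sentence indices, and the genetic algorithm uses tournament selection, uniform crossover, Gaussian mutation and elitism. All function names are hypothetical.

```python
import random

def summarize(features, weights, k):
    """Return indices of the k highest-scoring sentences, in document order."""
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    top = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)

def f1(selected, reference):
    """F1 overlap between selected and reference sentence indices (assumed metric)."""
    hits = len(set(selected) & set(reference))
    if hits == 0:
        return 0.0
    precision = hits / len(selected)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)

def fitness(weights, docs):
    """Mean F1 over documents; each document is (feature_rows, reference_indices)."""
    total = sum(f1(summarize(feats, weights, len(ref)), ref) for feats, ref in docs)
    return total / len(docs)

def mutate(gene, rng, rate):
    """With probability `rate`, add Gaussian noise and clip the gene to [0, 1]."""
    if rng.random() < rate:
        return min(1.0, max(0.0, gene + rng.gauss(0, 0.1)))
    return gene

def evolve(docs, n_features, pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    """Search for feature weights in [0, 1] with a simple generational GA."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    score = lambda w: fitness(w, docs)
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        next_gen = ranked[:2]  # elitism: keep the two best weight vectors unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(ranked, 3), key=score)
            p2 = max(rng.sample(ranked, 3), key=score)
            # Uniform crossover followed by per-gene mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            next_gen.append([mutate(g, rng, mutation_rate) for g in child])
        population = next_gen
    return max(population, key=score)

def make_toy_docs(n_docs, rng, n_sentences=8, n_features=3):
    """Synthetic data: the 'reference summary' is the 3 sentences with the largest feature 0."""
    docs = []
    for _ in range(n_docs):
        feats = [[rng.random() for _ in range(n_features)] for _ in range(n_sentences)]
        docs.append((feats, summarize(feats, [1.0] + [0.0] * (n_features - 1), 3)))
    return docs

if __name__ == "__main__":
    rng = random.Random(42)
    train, test = make_toy_docs(20, rng), make_toy_docs(10, rng)
    best = evolve(train, n_features=3)
    print("weights:", [round(w, 2) for w in best])
    print("train F1:", round(fitness(best, train), 3), "test F1:", round(fitness(best, test), 3))
```

In the setting of the study, the training list would hold the 80 training documents and the held-out list the 40 test documents, with real sentence features and reference summaries in place of the synthetic data used here.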
