Enhancing Abstractive Summarization with Extracted Knowledge Graphs and Multi-Source Transformers
Abstract
1. Introduction
- Wiki-Sum, a dataset extracted from the Wikipedia textual dump, then tokenized and processed into article–abstract summary pairs;
- MultiBART-GAT, a framework for abstractive text summarization that combines transformers with graph-representation augmentation (a minimal sketch of the idea follows below).
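To make the second contribution concrete, here is a minimal sketch of what "transformers with graph-representation augmentation" can look like: a decoder layer that attends both to the text encoder's token states and to graph-node embeddings. This is not the authors' released code; module names, fusion order, and sizes are illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of the multi-source idea behind
# MultiBART-GAT: a decoder layer that attends both to the transformer
# encoder's token states and to graph-node embeddings produced by a
# graph-attention encoder. All names and dimensions are illustrative.
import torch
import torch.nn as nn

class MultiSourceDecoderLayer(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.text_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.graph_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(4))

    def forward(self, tgt, text_mem, graph_mem):
        # 1) self-attention over the partial summary (causal mask omitted for brevity)
        x = self.norms[0](tgt + self.self_attn(tgt, tgt, tgt)[0])
        # 2) cross-attention into the BART-style text-encoder states
        x = self.norms[1](x + self.text_attn(x, text_mem, text_mem)[0])
        # 3) cross-attention into GAT node embeddings of the extracted knowledge graph
        x = self.norms[2](x + self.graph_attn(x, graph_mem, graph_mem)[0])
        return self.norms[3](x + self.ffn(x))

layer = MultiSourceDecoderLayer()
tgt = torch.randn(2, 16, 768)        # partial summary tokens
text_mem = torch.randn(2, 512, 768)  # encoder states of the source article
graph_mem = torch.randn(2, 40, 768)  # one embedding per knowledge-graph node
print(layer(tgt, text_mem, graph_mem).shape)  # torch.Size([2, 16, 768])
```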
2. Related Work
3. Problem Statement
3.1. Encoder–Decoder Framework for Summarization
3.2. Using Knowledge Graphs to Augment Summarization
4. Model Formulation
4.1. Multi-Headed Transformers
4.2. Graph Attention
4.3. Encoder–Decoder Network
4.4. Initialization, Training, and Loss Function
5. Dataset: Wiki-Sum
5.1. Data Collection
5.2. Dataset Analysis
- Locations: Canada; Akron, Ohio; Gwangju; Baja California Peninsula
- People: Lee Se-young; Sandeep Vanga; Harvey Weinstein; Shahid Khan
- Concepts: Civil engineering; Aryan race; Denial-of-service attack; Student's t-test
- Films, TV Series, Literature: Doctor Who; The Last of Us; The Exorcist (film); Designated Survivor (season 2)
- Organizations: Ferrari; Roman Republic; New York Yankees; Princeton University
- Events: Spanish Civil War; Hurricane Irma; European colonization of the Americas; 2017–18 NHL season
5.3. Knowledge Graph Construction
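The reference list includes Angeli et al.'s open information extraction system, a common basis for building knowledge graphs from raw text. Assuming triples are extracted in that style, the following is a hypothetical sketch using the Stanford CoreNLP server through the stanza client; it illustrates the general subject–relation–object pipeline rather than the paper's exact procedure.

```python
# Hypothetical sketch: extracting (subject, relation, object) triples with
# Stanford CoreNLP's OpenIE (Angeli et al.) via the stanza client, then
# treating them as knowledge-graph edges. Requires a local CoreNLP install
# with the CORENLP_HOME environment variable set.
from stanza.server import CoreNLPClient

text = ("August Horch founded Audi. "
        "Volkswagen acquired Auto Union in the 1960s.")

with CoreNLPClient(
    annotators=["tokenize", "ssplit", "pos", "lemma", "depparse", "natlog", "openie"],
    be_quiet=True,
) as client:
    ann = client.annotate(text)
    edges = []
    for sentence in ann.sentence:
        for triple in sentence.openieTriple:
            # entities become graph nodes; relations become labeled edges
            edges.append((triple.subject, triple.relation, triple.object))

for subj, rel, obj in edges:
    print(f"({subj}) -[{rel}]-> ({obj})")
```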
6. Experiments
6.1. Baselines and Comparison
6.2. Metrics: ROUGE
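ROUGE-1, ROUGE-2, and ROUGE-L measure unigram, bigram, and longest-common-subsequence overlap between a candidate and a reference summary. As a worked example of how the scores reported in the tables below can be computed, here is a minimal sketch using Google's rouge-score package; this is one common implementation choice, not necessarily the one the authors used.

```python
# Minimal sketch of computing ROUGE-1/2/L F1 with the rouge-score package.
from rouge_score import rouge_scorer

reference = "the balance of trade is the difference between exports and imports"
candidate = "balance of trade measures the difference between exports and imports"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)
for name, s in scores.items():
    # each entry carries precision, recall, and F1; papers typically report F1
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```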
6.3. Data Augmentation on Our Dataset
6.4. Running Models with Reduced CNN-DailyMail Dataset
7. Results
7.1. Automatic Evaluation
7.2. Result Analysis
8. Conclusions
- Dataset cleaning. To make the dataset more suitable, data pruning could be conducted by hand or through crowdworkers (e.g., via Amazon Mechanical Turk) to form more concise versions of long Wikipedia articles that preserve most of the information. Stubs still included in the dataset, as well as special articles without an introduction, should also be removed; a simple heuristic filter along these lines is sketched after this list. Furthermore, it is worth noting that the abstracts of Wikipedia articles describing people and places usually contain information (e.g., full names and birth and death dates) that is saved exclusively in the infobox on the side. We think these facts should be included in the source article in some form, such as automatically generated sentences.
- Model design. On the one hand, other graph encoders such as GCN or GGNN should also be considered as concatenated structural inputs. On the other hand, inputs to the encoder could also include extracted entities in the form of emphasized tags, which is another way to use internal semantics as input enrichment.
- Training. We think there is much room for improvement in the speed of model convergence; one way to achieve this is to use a loss that focuses more on node selection.
- Future directions. In the final design of our model, we wish to incorporate multi-hop generation that controls the story flow, coupled with regularly trained language models adapted for text generation. This should lead to an LLM that aligns with verified storylines and makes fewer factual errors.
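As referenced in the dataset-cleaning bullet above, a first automatic pass could drop stubs and articles whose introduction is missing or too short. The following is a minimal sketch under assumed thresholds; the cutoffs and field names are illustrative, not taken from the paper.

```python
# Hedged sketch of the automatic part of the proposed dataset cleaning:
# dropping stubs and articles without a usable introduction. Thresholds
# and field names are illustrative assumptions.
MIN_INTRO_WORDS = 40   # shorter lead sections are treated as missing intros
MIN_BODY_WORDS = 200   # very short bodies are treated as stubs

def keep_article(intro: str, body: str) -> bool:
    """Return True if the (intro, body) pair should stay in Wiki-Sum."""
    if len(intro.split()) < MIN_INTRO_WORDS:  # missing or too-short introduction
        return False
    if len(body.split()) < MIN_BODY_WORDS:    # stub article
        return False
    return True

articles = [
    {"intro": "Audi AG is a German automobile manufacturer ... " * 10,
     "body": "The origins of the company are complex ... " * 60},
    {"intro": "", "body": "A short stub."},
]
cleaned = [a for a in articles if keep_article(a["intro"], a["body"])]
print(f"kept {len(cleaned)} of {len(articles)} articles")  # kept 1 of 2 articles
```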
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- See, A.; Liu, P.J.; Manning, C.D. Get to the point: Summarization with pointer-generator networks. arXiv 2017, arXiv:1704.04368.
- Nallapati, R.; Zhou, B.; dos Santos, C.; Gülçehre, Ç.; Xiang, B. Abstractive text summarization using sequence-to-sequence RNNs and beyond. In Proceedings of the Conference on Computational Natural Language Learning, Berlin, Germany, 7–12 August 2016.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL-HLT, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461.
- OpenAI. GPT-4 Technical Report. 2023. Available online: https://arxiv.org/abs/2303.08774 (accessed on 24 June 2023).
- Lin, C.Y. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out; Association for Computational Linguistics: Barcelona, Spain, 2004; pp. 74–81.
- Cao, Z.; Wei, F.; Li, W.; Li, S. Faithful to the original: Fact-aware neural abstractive summarization. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, New Orleans, LA, USA, 2–7 February 2018; pp. 4784–4791.
- Paulus, R.; Xiong, C.; Socher, R. A Deep Reinforced Model for Abstractive Summarization. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017.
- Gehrmann, S.; Deng, Y.; Rush, A.M. Bottom-up abstractive summarization. arXiv 2018, arXiv:1808.10792.
- Dong, L.; Yang, N.; Wang, W.; Wei, F.; Liu, X.; Wang, Y.; Gao, J.; Zhou, M.; Hon, H.W. Unified language model pre-training for natural language understanding and generation. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 13063–13075.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Song, L.; Zhang, Y.; Wang, Z.; Gildea, D. A Graph-to-Sequence Model for AMR-to-Text Generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; Volume 1: Long Papers, pp. 1616–1626.
- Beck, D.; Haffari, G.; Cohn, T. Graph-to-Sequence Learning using Gated Graph Neural Networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; Volume 1: Long Papers, pp. 273–283.
- Damonte, M.; Cohen, S.B. Structural Neural Encoders for AMR-to-text Generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1 (Long and Short Papers), pp. 3649–3658.
- Dou, Z.Y.; Liu, P.; Hayashi, H.; Jiang, Z.; Neubig, G. GSum: A General Framework for Guided Neural Abstractive Summarization. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 4830–4842.
- Vodolazova, T.; Lloret, E. The Impact of Rule-Based Text Generation on the Quality of Abstractive Summaries. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), Varna, Bulgaria, 2–4 September 2019; pp. 1275–1284.
- Lin, B.Y.; Shen, M.; Xing, Y.; Zhou, P.; Ren, X. CommonGen: A constrained text generation dataset towards generative commonsense reasoning. arXiv 2019, arXiv:1911.03705.
- Guan, J.; Huang, F.; Zhao, Z.; Zhu, X.; Huang, M. A knowledge-enhanced pretraining model for commonsense story generation. Trans. Assoc. Comput. Linguist. 2020, 8, 93–108.
- Dhingra, B.; Zaheer, M.; Balachandran, V.; Neubig, G.; Salakhutdinov, R.; Cohen, W.W. Differentiable reasoning over a virtual knowledge base. arXiv 2020, arXiv:2002.10640.
- Zhou, H.; Young, T.; Huang, M.; Zhao, H.; Xu, J.; Zhu, X. Commonsense Knowledge Aware Conversation Generation with Graph Attention. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018; pp. 4623–4629.
- Hajdik, V.; Buys, J.; Goodman, M.W.; Bender, E.M. Neural Text Generation from Rich Semantic Representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1 (Long and Short Papers), pp. 2259–2266.
- Marcheggiani, D.; Perez-Beltrachini, L. Deep Graph Convolutional Encoders for Structured Data to Text Generation. In Proceedings of the 11th International Conference on Natural Language Generation, Tilburg, The Netherlands, 5–8 November 2018; pp. 1–9.
- Wu, Z.; Koncel-Kedziorski, R.; Ostendorf, M.; Hajishirzi, H. Extracting Summary Knowledge Graphs from Long Documents. arXiv 2020, arXiv:2009.09162.
- Hou, S.; Lu, R. Knowledge-guided unsupervised rhetorical parsing for text summarization. Inf. Syst. 2020, 94, 101615.
- Zhu, C.; Hinthorn, W.; Xu, R.; Zeng, Q.; Zeng, M.; Huang, X.; Jiang, M. Boosting factual correctness of abstractive summarization with knowledge graph. arXiv 2020, arXiv:2003.08612.
- Gunel, B.; Zhu, C.; Zeng, M.; Huang, X. Mind The Facts: Knowledge-Boosted Coherent Abstractive Text Summarization. arXiv 2020, arXiv:2006.15435.
- Huang, L.; Wu, L.; Wang, L. Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 2787–2795.
- Angeli, G.; Premkumar, M.J.J.; Manning, C.D. Leveraging linguistic structure for open domain information extraction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; Volume 1: Long Papers, pp. 344–354.
Split | Size
---|---
Training set | 77,545
Validation set | 9,697
Test set | 9,698
Topic | Words |
---|---|
Topic 1 (history) | war, force, army |
Topic 2 (literature) | write, century, work |
Topic 3 (sports) | season, game, team |
Topic 4 (media) | film, release, series |
Topic 5 (geography) | city, north, south |
Topic 6 (politics) | government, party, united |
Topic 7 (education) | university, school, program |
Topic 8 (technology) | system, formula, process |
Text | Summary |
---|---|
Hoffa was born in Brazil, Indiana, on 14 February 1913, to John and Viola (née Riddle) Hoffa. From an early age, Hoffa was a union activist, and he became an important regional figure with the IBT by his mid-twenties. By 1952, he was the national vice-president of the IBT and was its general president between 1957 and 1971. He secured the first national agreement for teamsters’ rates in 1964 with the National Master Freight Agreement. He played a major role in the growth and the development of the union, which eventually became the largest by membership in the United States, with over 2.3 million members at its peak, during his term as its leader. | 
Hoffa became involved with organized crime from the early years of his Teamsters work, a connection that continued until his disappearance in 1975. He was convicted of jury tampering, attempted bribery, conspiracy, and mail and wire fraud in 1964 in two separate trials. He was imprisoned in 1967 and sentenced to 13 years. In mid-1971, he resigned as president of the union as part of a commutation agreement with US President Richard Nixon and was released later that year, but Hoffa was barred from union activities until 1980. Hoping to regain support and to return to IBT leadership, he unsuccessfully tried to overturn the order. | 
Hoffa disappeared on 30 July 1975. He is believed to have been murdered by the Mafia and was declared legally dead in 1982. Hoffa’s legacy continues to stir debate. | |
... | James Riddle Hoffa (born 14 February 1913—disappeared 30 July 1975, declared dead 30 July 1982) was an American labor union leader who served as the president of the International Brotherhood of Teamsters (IBT) from 1957 until 1971. |
The origins of the company are complex, going back to the early 20th century and the initial enterprises (Horch and the Audiwerke) founded by engineer August Horch; and two other manufacturers (DKW and Wanderer), leading to the foundation of Auto Union in 1932. The modern era of Audi essentially began in the 1960s when Auto Union was acquired by Volkswagen from Daimler-Benz. After relaunching the Audi brand with the 1965 introduction of the Audi F103 series, Volkswagen merged Auto Union with NSU Motorenwerke in 1969, thus creating the present day form of the company. | |
The company name is based on the Latin translation of the surname of the founder, August Horch. “Horch”, meaning “listen” in German, becomes “audi” in Latin. The four rings of the Audi logo each represent one of four car companies that banded together to create Audi’s predecessor company, Auto Union. Audi’s slogan is Vorsprung durch Technik, meaning “Being Ahead through Technology”. Audi, along with fellow German marques BMW and Mercedes-Benz, is among the best-selling luxury automobile brands in the world. | |
... | Audi AG is a German automobile manufacturer that designs, engineers, produces, markets and distributes luxury vehicles. Audi is a wholly owned subsidiary of the Volkswagen Group and has its roots at Ingolstadt, Bavaria, Germany. Audi-branded vehicles are produced in nine production facilities worldwide. |
The balance of trade forms part of the current account, which includes other transactions such as income from the net international investment position as well as international aid. If the current account is in surplus, the country’s net international asset position increases correspondingly. Equally, a deficit decreases the net international asset position. | |
Many countries in early modern Europe adopted a policy of mercantilism, which theorized that a trade surplus was beneficial to a country, among other elements such as colonialism and trade barriers with other countries and their colonies. (Bullionism was an early philosophy supporting mercantilism.) | |
In March 2019, Armenia recorded a trade deficit of US$203.9 million. For the last two decades, the Armenian trade balance has been negative, reaching an all-time high of –33.98 USD million in August 2003. The reason for the trade deficit is that Armenia’s foreign trade is limited by its landlocked location and border disputes with Turkey and Azerbaijan, to the west and east, respectively. The situation results in the country’s typically reporting large trade deficits. | |
... | The balance of trade, commercial balance, or net exports (sometimes symbolized as NX) is the difference between the monetary value of a nation’s exports and imports over a certain time period. Sometimes a distinction is made between a balance of trade for goods versus one for services. The balance of trade measures the flow of exports and imports over a given period of time. |
Models | ROUGE-1 (Original) | ROUGE-2 (Original) | ROUGE-L (Original) | ROUGE-1 (Reduced) | ROUGE-2 (Reduced) | ROUGE-L (Reduced)
---|---|---|---|---|---|---
BART | 44.16 | 21.28 | 40.90 | 24.31 | 10.99 | 22.91 |
ASGARD | 43.93 | 20.37 | 40.48 | 36.61 | 14.82 | 33.73 |
MultiBART-GAT | 35.74 | 20.03 | 35.44 | 27.73 | 10.06 | 27.17 |
Models (Wiki-Sum) | ROUGE-1 | ROUGE-2 | ROUGE-L
---|---|---|---
BART | 29.24 | 8.57 | 26.62 |
ASGARD | 31.74 | 13.48 | 29.02 |
MultiBART-GAT | 30.27 | 10.02 | 24.101 |
Ground Truth | Summary by BART | Summary by ASGARD |
---|---|---|
Georgia is a transcontinental country in the Caucasus region of Eurasia. Located at the crossroads of Western Asia and Eastern Europe, it is bounded to the west by the Black Sea, to the north by Russia, to the east by Azerbaijan, and to the south by Armenia and Turkey. The capital and largest city is Tbilisi. | The Roman Empire, also known as the Roman Republic, was a Roman catholic church in the United States. It was founded in 1848 by the Russian Empire, and was the second largest in the world. It is one of the most populous city of the world’s most populous state. | Georgia, also known as the Georgia or Georgian, is a feudal region of the Persian peninsula. It is the largest of the world’s largest city in the world. |
Blues Brothers 2000 is a 1998 American musical comedy film that is a sequel to the 1980 film The Blues Brothers, written and produced by John Landis and Dan Aykroyd. Directed by Landis, the film stars Aykroyd and John Goodman, with cameo appearances by various musicians. The film is dedicated to John Belushi, Cab Calloway, and John Candy, cast members from the original film who had died prior to the sequel’s production, as well as Junior Wells, who died one month before it was released. | David is a 2017 American comedy–drama film written and directed by James Walt Disney. The film is based on the novel of the same name by Jimmy Lee. It is the second installment in the “Star Trek” film series. | Y Blues Brothers 2000 is a 1998 American musical film directed by Jim Belushi, starring Joe Morton, and Joliet John Belushi. The film was released on 14 October 2000. |
Redlining is the systematic denial of various services or goods by federal government agencies, local governments, or the private sector either directly or through the selective raising of prices. This is often manifested by placing strict criteria on specific services and goods that often disadvantage poor and minority communities. | The U.S. state company is an American multinational company headquartered in New York City, California. The company was founded in the United States, with the company’s largest company in the world, and was founded by the United Kingdom and the company of London. | Redlining, also known as the redlining or redlining, is a practice that is used to refer to the practice of the Housing Act of the United States. |
Ground Truth | Summary by ASGARD | Summary by MultiBART-GAT |
---|---|---|
Novak Djokovic beat Andy Murray 7-6 4-6 6-0 in Miami Open 2015 final. Djokovic lost his cool after losing the second set to the Brit in Florida. World No 1 Djokovic shouted at his support team next to a scared ball boy. After seeing the replay, the Serbian posted an apology video on Facebook. | Novak Djokovic issued an apology to a ball boy he frightened during the Miami Open final to Andy Murray. The world No 1 grabbed a towel to the ball boy’s parents. | Novak Djokovic has posted an apology video on Facebook after losing their cool after losing the Miami Open final to Andy Murray. The Serbian world No 1 lost their cool in the second set of the final against the Brit and shouted at their support team next to a scared ball boy as he was beaten 7-6 4-6 6-0 by Murray in the final of the tournament in Florida on Sunday. |
Alfred Taubman, who died on Friday, was active in philanthropy and worth an estimated USD 3.1 billion. Amid the suburban boom of the ’50s, he realized people would need places to shop: “we could not miss.” He was convicted in 2002 of trying to rig auction house commissions; he maintained he was innocent. | The Robert Alfred S. Alfred Taubman died on Friday at their home of a heart attack at a home after a heart at a heart of their heart attack. He donated and manages owns and manages 19 regional shopping centers and owns 19 regional and currently owns 19 and owns and a manages 19 years ago. | Alfred Taubman, a billionaire who was convicted in 2002 of trying to rig auction house commissions, has died at the age of 90. Taubmann, who was born in New York City, was the co-founder of We Auctions, one of the nation’s largest auction houses. He died Friday at their home in the Bronx, New York, after a long battle with cancer, according to a statement from their family. |