Representation Learning for Natural Language Processing

About the book "Representation Learning for Natural Language Processing" by Zhiyuan Liu, Yankai Lin, and Maosong Sun, published by Springer Nature Singapore Pte Ltd (formerly known as Springer Science + Business Media Singapore Pte Ltd) in 2024. The book is available in PDF format, in English. "Representation Learning for Natural Language Processing" is listed under the "Uncategorized" category.

distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. 
Preface
Book Organization
Book Cover
Note for the Second Edition
Prerequisites
Contact Information
Acknowledgments
Acknowledgments for the Second Edition
Acknowledgments for the First Edition
Contents
Contributors
Acronyms
Symbols and Notations

1 Representation Learning and NLP
1.1 Motivation
1.2 Why Representation Learning Is Important for NLP
1.2.1 Multiple Granularities
1.2.2 Multiple Knowledge
1.2.3 Multiple Tasks
1.2.4 Multiple Domains
1.3 Development of Representation Learning for NLP
1.3.1 Symbolic Representation and Statistical Learning
1.3.2 Distributed Representation and Deep Learning
1.3.3 Going Deeper and Larger with Pre-training on Big Data
1.4 Intellectual Origins of Distributed Representation
1.4.1 Representation Debates in Cognitive Neuroscience
1.4.2 Knowledge Representation in AI
1.4.3 Feature Engineering in Machine Learning
1.4.4 Linguistics
1.5 Representation Learning Approaches in NLP
1.5.1 Feature Engineering
1.5.2 Supervised Representation Learning
1.5.3 Self-supervised Representation Learning
1.6 How to Apply Representation Learning to NLP
1.6.1 Input Augmentation
1.6.2 Architecture Reformulation
1.6.3 Objective Regularization
1.6.4 Parameter Transfer
1.7 Advantages of Distributed Representation Learning
1.8 The Organization of This Book
References

2 Word Representation Learning
2.1 Introduction
2.2 Symbolic Word Representation
2.2.1 One-Hot Word Representation
2.2.2 Linguistic KB-based Word Representation
2.2.3 Corpus-based Word Representation
2.3 Distributed Word Representation
2.3.1 Preliminary: Interpreting the Representation
2.3.2 Matrix Factorization-based Word Representation
2.3.3 Word2vec and GloVe
2.3.4 Contextualized Word Representation
2.4 Advanced Topics
2.4.1 Informative Word Representation
2.4.2 Interpretable Word Representation
2.5 Applications
2.5.1 NLP
2.5.2 Cognitive Psychology
2.5.3 History and Social Science
2.6 Summary and Further Readings
References

3 Representation Learning for Compositional Semantics
3.1 Introduction
3.2 Binary Composition
3.2.1 Additive Model
3.2.2 Multiplicative Model
3.3 N-ary Composition
3.4 Summary and Further Readings
References

4 Sentence and Document Representation Learning
4.1 Introduction
4.2 Symbolic Sentence Representation
4.2.1 Bag-of-Words Model
4.2.2 Probabilistic Language Model
4.3 Neural Language Models
4.3.1 Feed-Forward Neural Network
4.3.2 Convolutional Neural Network
4.3.3 Recurrent Neural Network
4.3.4 Transformer
4.3.5 Enhancing Neural Language Models
4.4 From Sentence to Document Representation
4.4.1 Memory-Based Document Representation
4.4.2 Hierarchical Document Representation
4.5 Applications
4.5.1 Text Classification
4.5.2 Information Retrieval
4.5.3 Reading Comprehension
4.5.4 Open-Domain Question Answering
4.5.5 Sequence Labeling
4.5.6 Sequence-to-Sequence Generation
4.6 Summary and Further Readings
References

5 Pre-trained Models for Representation Learning
5.1 Introduction
5.2 Pre-training Tasks
5.2.1 Word-Level Pre-training
5.2.2 Sentence-Level Pre-training
5.3 Model Adaptation
5.3.1 Full-Parameter Fine-Tuning
5.3.2 Delta Tuning
5.3.3 Prompt Learning
5.4 Advanced Topics
5.4.1 Better Model Architecture
5.4.2 Multilingual Representation
5.4.3 Multi-Task Representation
5.4.4 Efficient Representation
5.4.5 Chain-of-Thought Reasoning
5.5 Summary and Further Readings
References

6 Graph Representation Learning
6.1 Introduction
6.2 Symbolic Graph Representation
6.3 Shallow Node Representation Learning
6.3.1 Spectral Clustering
6.3.2 Shallow Neural Networks
6.3.3 Matrix Factorization
6.4 Deep Node Representation Learning
6.4.1 Autoencoder-Based Methods
6.4.2 Graph Convolutional Networks
6.4.3 Graph Attention Networks
6.4.4 Graph Recurrent Networks
6.4.5 Graph Transformers
6.4.6 Extensions
6.5 From Node Representation to Graph Representation
6.5.1 Flat Pooling
6.5.2 Hierarchical Pooling
6.6 Self-Supervised Graph Representation Learning
6.7 Applications
6.8 Summary and Further Readings
References

7 Cross-Modal Representation Learning
7.1 Introduction
7.2 Cross-Modal Capabilities
7.3 Shallow Cross-Modal Representation Learning
7.4 Deep Cross-Modal Representation Learning
7.4.1 Cross-Modal Understanding
7.4.2 Cross-Modal Retrieval
7.4.3 Cross-Modal Generation
7.5 Deep Cross-Modal Pre-training
7.5.1 Input Representations
7.5.2 Model Architectures
7.5.3 Pre-training Tasks
7.5.4 Adaptation Approaches
7.6 Applications
7.7 Summary and Further Readings
References

8 Robust Representation Learning
8.1 Introduction
8.2 Backdoor Robustness
8.2.1 Backdoor Attack on Supervised Representation Learning
8.2.2 Backdoor Attack on Self-Supervised Representation Learning
8.2.3 Backdoor Defense
8.2.4 Toolkits
8.3 Adversarial Robustness
8.3.1 Adversarial Attack
8.3.2 Adversarial Defense
8.3.3 Toolkits
8.4 Out-of-Distribution Robustness
8.4.1 Spurious Correlation
8.4.2 Domain Shift
8.4.3 Subpopulation Shift
8.5 Interpretability
8.5.1 Understanding Model Functionality
8.5.2 Explaining Model Mechanism
8.6 Summary and Further Readings
References

9 Knowledge Representation Learning and Knowledge-Guided NLP
9.1 Introduction
9.2 Symbolic Knowledge and Model Knowledge
9.2.1 Symbolic Knowledge
9.2.2 Model Knowledge
9.2.3 Integrating Symbolic Knowledge and Model Knowledge
9.3 Knowledge Representation Learning
9.3.1 Linear Representation
9.3.2 Translation Representation
9.3.3 Neural Representation
9.3.4 Manifold Representation
9.3.5 Contextualized Representation
9.3.6 Summary
9.4 Knowledge-Guided NLP
9.4.1 Knowledge Augmentation
9.4.2 Knowledge Reformulation
9.4.3 Knowledge Regularization
9.4.4 Knowledge Transfer
9.4.5 Summary
9.5 Knowledge Acquisition
9.5.1 Sentence-Level Relation Extraction
9.5.2 Bag-Level Relation Extraction
9.5.3 Document-Level Relation Extraction
9.5.4 Few-Shot Relation Extraction
9.5.5 Open-Domain Relation Extraction
9.5.6 Contextualized Relation Extraction
9.5.7 Summary
9.6 Summary and Further Readings
References

10 Sememe-Based Lexical Knowledge Representation Learning
10.1 Introduction
10.2 Linguistic and Commonsense Knowledge Bases
10.2.1 WordNet and ConceptNet
10.2.2 HowNet
10.2.3 HowNet and Deep Learning
10.3 Sememe Knowledge Representation
10.3.1 Sememe-Encoded Word Representation
10.3.2 Sememe-Regularized Word Representation
10.4 Sememe-Guided Natural Language Processing
10.4.1 Sememe-Guided Semantic Compositionality Modeling
10.4.2 Sememe-Guided Language Modeling
10.4.3 Sememe-Guided Recurrent Neural Networks
10.5 Automatic Sememe Knowledge Acquisition
10.5.1 Embedding-Based Sememe Prediction
10.5.2 Sememe Prediction with Internal Information
10.5.3 Cross-lingual Sememe Prediction
10.5.4 Connecting HowNet with BabelNet
10.5.5 Summary and Discussion
10.6 Applications
10.6.1 Chinese LIWC Lexicon Expansion
10.6.2 Reverse Dictionary
10.7 Summary and Further Readings
References

11 Legal Knowledge Representation Learning
11.1 Introduction
11.2 Typical Tasks and Real-World Applications
11.3 Legal Knowledge Representation and Acquisition
11.3.1 Legal Textual Knowledge
11.3.2 Legal Structured Knowledge
11.3.3 Discussion
11.4 Knowledge-Guided Legal NLP
11.4.1 Input Augmentation
11.4.2 Architecture Reformulation
11.4.3 Objective Regularization
11.4.4 Parameter Transfer
11.5 Outlook
11.6 Ethical Consideration
11.7 Open Competitions and Benchmarks
11.8 Summary and Further Readings
References

12 Biomedical Knowledge Representation Learning
12.1 Introduction
12.1.1 Perspectives for Biomedical NLP
12.1.2 Role of Knowledge in Biomedical NLP
12.2 Biomedical Knowledge Representation and Acquisition
12.2.1 Biomedical Knowledge from Natural Language
12.2.2 Biomedical Knowledge from Biomedical Language Materials
12.3 Knowledge-Guided Biomedical NLP
12.3.1 Input Augmentation
12.3.2 Architecture Reformulation
12.3.3 Objective Regularization
12.3.4 Parameter Transfer
12.4 Typical Applications
12.4.1 Literature Processing
12.4.2 Retrosynthetic Prediction
12.4.3 Diagnosis Assistance
12.5 Advanced Topics
12.6 Summary and Further Readings
References

13 OpenBMB: Big Model Systems for Large-Scale Representation Learning
13.1 Introduction
13.2 BMTrain: Efficient Training Toolkit for Big Models
13.2.1 Data Parallelism
13.2.2 ZeRO Optimization
13.2.3 Quickstart of BMTrain
13.3 OpenPrompt and OpenDelta: Efficient Tuning Toolkit for Big Models
13.3.1 Serving Multiple Tasks with a Unified Big Model
13.3.2 Quickstart of OpenPrompt
13.3.3 QuickStart of OpenDelta
13.4 BMCook: Efficient Compression Toolkit for Big Models
13.4.1 Model Quantization
13.4.2 Model Distillation
13.4.3 Model Pruning
13.4.4 Model MoEfication
13.4.5 QuickStart of BMCook
13.5 BMInf: Efficient Inference Toolkit for Big Models
13.5.1 Accelerating Big Model Inference
13.5.2 Reducing the Memory Footprint of Big Models
13.5.3 QuickStart of BMInf
13.6 Summary and Further Readings
References

14 Ten Key Problems of Pre-trained Models: An Outlook of Representation Learning
14.1 Pre-trained Models: New Era of Representation Learning
14.2 Ten Key Problems of Pre-trained Models
14.2.1 P1: Theoretical Foundation of Pre-trained Models
14.2.2 P2: Next-Generation Model Architecture
14.2.3 P3: High-Performance Computing of Big Models
14.2.4 P4: Effective and Efficient Adaptation
14.2.5 P5: Controllable Generation with Pre-trained Models
14.2.6 P6: Safe and Ethical Big Models
14.2.7 P7: Cross-Modal Computation
14.2.8 P8: Cognitive Learning
14.2.9 P9: Innovative Applications of Big Models
14.2.10 P10: Big Model Systems Accessible to Users
14.3 Summary
References

This book provides an overview of the recent advances in representation learning theory, algorithms, and applications for natural language processing (NLP), ranging from word embeddings to pre-trained language models. It is divided into four parts. Part I presents the representation learning techniques for multiple language entries, including words, sentences and documents, as well as pre-training techniques. Part II then introduces the related representation techniques to NLP, including graphs, cross-modal entries, and robustness.
Part III then introduces representation techniques for knowledge closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, legal domain knowledge and biomedical domain knowledge. Lastly, Part IV discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.

Compared to the first edition, the second edition (1) provides a more detailed introduction to representation learning in Chapter 1; (2) adds four new chapters to introduce pre-trained language models, robust representation learning, legal knowledge representation learning and biomedical knowledge representation learning; (3) updates recent advances in representation learning in all chapters; and (4) corrects some errors in the first edition. Roughly 50% or more of the content is new compared to the first edition. This is an open access book.

Description of the first edition: This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries.
Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions.
