Tsinghua University: 2024 Large Model Tool Learning Report (English Edition) (48 pages).pdf


Tool Learning
秦禹嘉 (Qin Yujia), THUNLP

Background

Tools and Intelligence
- Tools are extensions of human capabilities, designed to enhance productivity, efficiency, and problem-solving.
- Throughout history, humans have been the primary agents in the invention and manipulation of tools.
- Question: can artificial intelligence be as capable as humans in tool use?
- The answer is yes, with foundation models:
  - Strong semantic understanding
  - Extensive world knowledge
  - Powerful reasoning and planning capabilities
- Tool learning [1]: foundation models can follow human instructions and manipulate tools for task solving.
  [1] Qin, Yujia, et al. Tool Learning with Foundation Models. arXiv preprint arXiv:2304.08354 (2023).

Categorization of Tool Learning
- Tool-augmented learning: augment foundation models with the execution results from tools. Tools are viewed as complementary resources that aid in the generation of high-quality outputs.
- Tool-oriented learning: utilize models to govern tools and make sequential decisions in place of humans, exploiting foundation models' vast world knowledge and reasoning ability for complex reasoning and planning.

Framework
- Tool set: a collection of tools with different functionalities.
- Environment: provides the platform where tools operate.
- Perceiver: summarizes feedback to the controller.
- Controller: provides feasible plans to fulfill user requests.

Intent Understanding
- Comprehending the underlying purpose of an instruction.
- Learning a mapping from the instruction space to the model's cognition space.
- Instruction tuning:
  - Wrap tasks with diverse instructions
  - Supervised fine-tuning
  - Extraordinary generalization capability
  [1] Finetuned Language Models Are Zero-Shot Learners
  [2] Multitask Prompted Training Enables Zero-Shot Task Generalization
  [3] OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
- Scaling up the model size and the diversity of instruction-tuning datasets enhances generalization capability.
- Challenges:
  - Understanding vague instructions: vagueness and ambiguity in the user query.
  - Theoretically infinite instruction space: infinite expression and personalized instructions.

Tool Understanding
- Eliciting tool understanding with prompting:
  - Zero-shot prompting: describe API functionalities, their input/output formats, possible parameters, etc. This allows the model to understand the tasks that each API can tackle.
  - Few-shot prompting: provide concrete tool-use demonstrations to the model. By mimicking human behaviors from these demonstrations, the model can learn how to utilize these tools.

Planning and Reasoning
- Introspective reasoning: generate a static plan without interacting with the environment.
- Extrospective reasoning: generate a dynamic plan that accounts for changes in the environment and feedback.
- Introspective reasoning: if prompted appropriately, PLMs can effectively decompose high-level tasks into mid-level
  plans without any further training.
  Reference: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents.
- Extrospective reasoning:
  - Challenge: foundation models are not embodied or grounded in the physical world.
  - Solution: constrain the model to propose natural language actions that are both feasible and contextually appropriate.
  Reference: Ahn, Michael, et al. Do as I can, not as I say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691 (2022).
- Extrospective reasoning, Inner Monologue [1]: injecting information from various sources of feedback into model planning.
  [1] Huang, Wenlong, et al. Inner monologue: Embodied reasoning through planning with language models. arXiv preprint arXiv:2207.05608 (2022).
- Multi-step, multi-tool scenarios: humans won't stick to one scenario and one tool.
  - Understanding the interplay among different tools: models should not only understand individual tools, but also learn how to combine them and order them logically.
  - From sequential execution to parallel execution: tools do not have to be executed sequentially; parallel execution leads to superimposed effects.
  - From single-agent problem-solving to multi-agent collaboration: complex tasks often necessitate collaboration among multiple agents, each with its own expertise.

Training Strategies
- Learning from demonstrations: often involves (human) annotations.
- Learning from feedback: often involves reinforcement learning.

WebGPT
- Supervised learning: clone human behavior to use search engines.
- Supervised fine-tuning + reinforcement learning.
- Only needs 6,000 annotated examples.
  Reference: Nakano, Reiichiro, et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).

WebCPM
- Motivation: WebGPT is not public, and its inner workings remain opaque.
- Our efforts (WebCPM):
  - Open-source interactive web search interface.
  - The first public QA dataset that involves interactive web search, and also the first Chinese LFQA dataset.
  - Framework and model implementation.
- Interface (search mode) and pre-defined actions.
- Our framework consists of two models:
  1. Search model, consisting of:
     - Action prediction module
     - Search query generation module
     - Supporting fact extraction module
  2. Information synthesis model
- For an action sequence of T steps, the search model executes actions to collect supporting facts, which are sent to the synthesis model for answer generation.
- Holistic pipeline evaluation (based on human preference): model-generated answer vs. human annotation. Three sources of supporting facts are sent to the synthesis model: (1) pipeline-collected, (2) human-collected, (3) non-interactive search (TF-IDF).

WebShop
- Learning to perform online shopping.

Toolformer
- Self-supervised tool learning.
- Pre-defined tool APIs.
- Encourage models to call and execute tool APIs.
- Design a self-supervised loss to see if the tool execution can help language modeling.
- If the tool execution reduces LM loss, save the instances as training data.

Tool Creation
- From tool user to tool creator:
  - Humans are the primary agents that create and use tools, from the Stone Age to the 21st century.
  - Most tools are created for humans, not AI.
- Tools made for models:
  - Modularized: compose tools into smaller units.
  - New input and output formats: more computable and suitable for AI.
- Limitations of existing works:
  - Most existing work tends to concentrate on a limited number of tools.
  - The reasoning process employed by models for determining the optimal utilization of tools is inherently complex.
  - The current pipelines lack an error-handling mechanism after retrieving execution results.
- Instead of letting LLMs act as the users of tools, we enable them to be the creators [1].
  [1] Qian, Cheng, et al. CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation.
- Four procedures: creation, decision, execution, rectification.
- Experiments:
  - Datasets: MATH, TabMWP.
  - Significant improvements over PoT and pure CoT.

Application

ChatGPT Plugins
- OpenAI's official tool library.
- Empower ChatGPT with broader applications.
- By simply providing APIs with descriptions, ChatGPT is enabled to call applications and complete more complex tasks.

Open-source Solutions: BMTools
- BMTools: an open-source repository that extends language models to use tools and serves as a platform for the community to build and share tools.
- Features:
  - Users can easily build a new plugin by writing Python functions
    and use external ChatGPT-Plugins.
  - Users can host their local models (e.g., LLaMA, CPM) to use tools.
  - 30+ tools supported; contributions are welcome. Examples: database, weather API, PPT, Google Scholar, Hugging Face models, image generation.
  - Support for BabyAGI and AutoGPT.
  - 100k+ tool-use SFT data on the way.

Open-source Solutions: ToolBench
- ToolBench: an open-source, large-scale, high-quality instruction-tuning SFT dataset to facilitate general tool-use capability.
- We provide the dataset, the corresponding training and evaluation scripts, and a capable model, ToolLLaMA, fine-tuned on ToolBench.
- Features:
  - Both single-tool and multi-tool scenarios are supported.
  - ToolBench provides responses that not only include the final answer but also incorporate the model's chain-of-thought process, tool execution, and tool execution results.
  - Multi-step decision making and tool execution.
  - Another notable advantage is the diversity of our API, which is designed for real-world scenarios.
  - 98k instances, 312k API calls.
- Construction process: all the data is automatically generated by the OpenAI API and then filtered; the whole data creation process is easy to scale up.
- Evaluation: ToolLLaMA matches ChatGPT's capabilities in tool use (auto-evaluated by ChatGPT; higher is better).

- Traditional language tasks are (almost) well solved: syntactic parsing, entity recognition, sentiment analysis.
- We are facing more challenging tasks!
- Foundation models can be leveraged in complex scenarios by using language, and the performance may largely rely on the LLM's effectiveness.
- Theoretical issues still exist.
- Practical issues still exist.
- Explore leveraging tool learning in complex scenarios.

Tool Learning Paper List
https:/
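
The Framework slide names four components: tool set, environment, perceiver, and controller. The sketch below is one possible way to wire them into a loop, not the authors' implementation. Every name in it (`TOOL_SET`, `environment`, `perceiver`, `controller`, `run`) is hypothetical, the two tools are toys, and the controller is a fixed rule standing in for a foundation model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Tool set: a collection of tools with different functionalities (toy examples).
TOOL_SET: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(float(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


def environment(tool_name: str, arg: str) -> str:
    """Environment: the platform where tools operate. Returns raw feedback."""
    try:
        return "OK " + TOOL_SET[tool_name](arg)
    except Exception as exc:  # surface tool failures as feedback, not crashes
        return f"ERROR {type(exc).__name__}"


def perceiver(raw_feedback: str) -> dict:
    """Perceiver: summarizes raw feedback into a form the controller can use."""
    status, _, payload = raw_feedback.partition(" ")
    return {"success": status == "OK", "result": payload}


def controller(request: str, history: List[dict]) -> Optional[Tuple[str, str]]:
    """Controller: proposes the next (tool, argument) step, or None to stop.

    Assumption: a real system would query a foundation model here.
    This stub uses a fixed rule and gives up after two failed attempts.
    """
    if history and history[-1]["success"]:
        return None
    if len(history) >= 2:
        return None
    tool = "add" if "+" in request else "upper"
    return tool, request


def run(request: str, max_steps: int = 5) -> Optional[str]:
    """Loop: controller plans, environment executes, perceiver reports back."""
    history: List[dict] = []
    for _ in range(max_steps):
        step = controller(request, history)
        if step is None:
            break
        history.append(perceiver(environment(*step)))
    if history and history[-1]["success"]:
        return history[-1]["result"]
    return None
```

Calling `run("2+3")` routes the request to the toy `add` tool, while a request with no `+` goes to `upper`. A request such as `run("a+b")` fails inside the tool, and the controller stops after two failed attempts.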

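The Toolformer slide states that an instance is saved as training data if the tool execution reduces the LM loss. The sketch below shows that filtering rule in its simplest form, under stated assumptions: `Candidate`, `filter_useful_calls`, and `min_gain` are hypothetical names, and the loss values in the usage note are made-up placeholders, not measurements. In practice the two losses would come from scoring the same text with and without the tool result in context.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One candidate tool call inserted into a piece of text."""
    text: str
    loss_without_tool: float  # LM loss on the continuation, no tool result
    loss_with_tool: float     # LM loss when the tool result is in context


def filter_useful_calls(candidates: List[Candidate],
                        min_gain: float = 0.0) -> List[Candidate]:
    """Keep only candidates whose tool result lowers the LM loss.

    min_gain is an assumed margin: the loss must drop by more than this
    amount for the instance to be saved as training data.
    """
    return [
        c for c in candidates
        if c.loss_without_tool - c.loss_with_tool > min_gain
    ]
```

As a usage example with placeholder numbers, `filter_useful_calls([Candidate("a", 2.0, 1.5), Candidate("b", 1.25, 1.25)])` keeps only the first candidate, because the second call leaves the loss unchanged.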