- See Also
-
Links
-
“
ArtPrompt
: ASCII Art-based Jailbreak Attacks against Aligned LLMs”, Jiang et al 2024 - “Beyond Memorization: Violating Privacy Via Inference With Large Language Models”, Staab et al 2023
- “HyperAttention: Long-context Attention in Near-Linear Time”, Han et al 2023
- “FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, Vu et al 2023
- “CausalLM Is Not Optimal for In-context Learning”, Ding et al 2023
- “Simple Synthetic Data Reduces Sycophancy in Large Language Models”, Wei et al 2023
- “Large Language Models Are Few-Shot Health Learners”, Liu et al 2023
- “SeeGULL: A Stereotype Benchmark With Broad Geo-Cultural Coverage Leveraging Generative Models”, Jha et al 2023
- “Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Bitton et al 2023
- “Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models”, Aksitov et al 2023
- “Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, Pilault et al 2023
- “Memory Augmented Large Language Models Are Computationally Universal”, Schuurmans 2023
- “Med-PaLM: Large Language Models Encode Clinical Knowledge”, Singhal et al 2022
- “Character-Aware Models Improve Visual Text Rendering”, Liu et al 2022
- “Efficiently Scaling Transformer Inference”, Pope et al 2022
- “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, Tay et al 2022
- “FLAN: Scaling Instruction-Finetuned Language Models”, Chung et al 2022
- “Large Language Models Can Self-Improve”, Huang et al 2022
- “RARR: Attributed Text Generation via Post-hoc Research and Revision”, Gao et al 2022
- “Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022
- “Language Models Are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022
- “ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
- “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”, Soltan et al 2022
- “Inner Monologue: Embodied Reasoning through Planning With Language Models”, Huang et al 2022
- “Solving Quantitative Reasoning Problems With Language Models”, Lewkowycz et al 2022
- “Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, Zhou et al 2022
- “Unifying Language Learning Paradigms”, Tay et al 2022
- “PaLM: Scaling Language Modeling With Pathways”, Chowdhery et al 2022
- “Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”, Ahn et al 2022
- “Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance”, Chowdhery & Narang 2022
- “PaLM § Figure 19: [Explaining a Joke / Inference Chaining] Each ‘Input” Was Independently Prepended With the Same 2-shot Exemplar Shown at the Top, and “Model Output’ Shows the Greedy Decoding Output of PaLM 540B. The Two Exemplar Jokes Are Known Jokes (explanations Written by Authors), While All Evaluated Jokes Were Written by the Authors. Of Course, These Jokes Do Share Abstract Premises With Existing Jokes (wordplay, Reliability, Humorous Analogies, Reversal-of-expectations). The Inference Chaining Examples Were Also Written by the Authors.”
- Sort By Magic
-
“
- Miscellaneous
- Link Bibliography
See Also
Links
“ArtPrompt
: ASCII Art-based Jailbreak Attacks against Aligned LLMs”, Jiang et al 2024
ArtPrompt
: ASCII Art-based Jailbreak Attacks against Aligned LLMs
“Beyond Memorization: Violating Privacy Via Inference With Large Language Models”, Staab et al 2023
Beyond Memorization: Violating Privacy Via Inference with Large Language Models
“HyperAttention: Long-context Attention in Near-Linear Time”, Han et al 2023
“FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, Vu et al 2023
FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
“CausalLM Is Not Optimal for In-context Learning”, Ding et al 2023
“Simple Synthetic Data Reduces Sycophancy in Large Language Models”, Wei et al 2023
Simple synthetic data reduces sycophancy in large language models
“Large Language Models Are Few-Shot Health Learners”, Liu et al 2023
“SeeGULL: A Stereotype Benchmark With Broad Geo-Cultural Coverage Leveraging Generative Models”, Jha et al 2023
SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models
“Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Bitton et al 2023
q2d: Turning Questions into Dialogs to Teach Models How to Search
“Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models”, Aksitov et al 2023
Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models
“Interactive-Chain-Prompting (INTERCPT): Ambiguity Resolution for Crosslingual Conditional Generation With Interaction”, Pilault et al 2023
“Memory Augmented Large Language Models Are Computationally Universal”, Schuurmans 2023
Memory Augmented Large Language Models are Computationally Universal
“Med-PaLM: Large Language Models Encode Clinical Knowledge”, Singhal et al 2022
“Character-Aware Models Improve Visual Text Rendering”, Liu et al 2022
“Efficiently Scaling Transformer Inference”, Pope et al 2022
“U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, Tay et al 2022
“FLAN: Scaling Instruction-Finetuned Language Models”, Chung et al 2022
“Large Language Models Can Self-Improve”, Huang et al 2022
“RARR: Attributed Text Generation via Post-hoc Research and Revision”, Gao et al 2022
RARR: Attributed Text Generation via Post-hoc Research and Revision
“Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022
Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them
“Language Models Are Multilingual Chain-of-Thought Reasoners”, Shi et al 2022
“ReAct: Synergizing Reasoning and Acting in Language Models”, Yao et al 2022
“AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”, Soltan et al 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
“Inner Monologue: Embodied Reasoning through Planning With Language Models”, Huang et al 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
“Solving Quantitative Reasoning Problems With Language Models”, Lewkowycz et al 2022
Solving Quantitative Reasoning Problems with Language Models
“Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, Zhou et al 2022
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
“Unifying Language Learning Paradigms”, Tay et al 2022
“PaLM: Scaling Language Modeling With Pathways”, Chowdhery et al 2022
“Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”, Ahn et al 2022
Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances
“Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance”, Chowdhery & Narang 2022
Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance
“PaLM § Figure 19: [Explaining a Joke / Inference Chaining] Each ‘Input” Was Independently Prepended With the Same 2-shot Exemplar Shown at the Top, and “Model Output’ Shows the Greedy Decoding Output of PaLM 540B. The Two Exemplar Jokes Are Known Jokes (explanations Written by Authors), While All Evaluated Jokes Were Written by the Authors. Of Course, These Jokes Do Share Abstract Premises With Existing Jokes (wordplay, Reliability, Humorous Analogies, Reversal-of-expectations). The Inference Chaining Examples Were Also Written by the Authors.”
Sort By Magic
Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics. For more details, see the link.
reasoning
language-learning
character-modeling
Miscellaneous
-
/doc/ai/nn/transformer/gpt/palm/2022-ahn-figure2-saycanqueryinglanguagemodelforoptions.png
: -
https://blog.research.google/2022/06/minerva-solving-quantitative-reasoning.html
-
https://blog.research.google/2023/01/google-research-2022-beyond-language.html
-
https://every.to/chain-of-thought/i-spent-a-week-with-gemini-pro-1-5-it-s-fantastic
-
https://minerva-demo.github.io/#category=Algebra&index=1
:View External Link:
-
https://old.reddit.com/r/singularity/comments/1atjz9v/ive_put_a_complex_codebase_into_a_single/
-
https://thezvi.wordpress.com/2024/02/27/the-gemini-incident-continues/
-
https://www.dwarkeshpatel.com/p/demis-hassabis#%C2%A7timestamps
-
https://www.lesswrong.com/posts/EHbJ69JDs4suovpLw/testing-palm-prompts-on-gpt3
:View External Link:
https://www.lesswrong.com/posts/EHbJ69JDs4suovpLw/testing-palm-prompts-on-gpt3
-
https://www.lesswrong.com/posts/YzbQeCiwoLBHrvAh4
:View External Link:
-
https://www.lesswrong.com/posts/mLuQfS7gmfr4nwTdv/google-s-new-540-billion-parameter-language-model
: -
https://www.reddit.com/r/GPT3/comments/twxtwg/how_gpt3_answers_the_google_pathway_sample/
: -
https://www.reddit.com/r/singularity/comments/1atjz9v/ive_put_a_complex_codebase_into_a_single/
-
https://www.theverge.com/2023/3/29/23662621/google-bard-chatgpt-sharegpt-training-denies
Link Bibliography
-
https://arxiv.org/abs/2310.03214#google
: “FreshLLMs: Refreshing Large Language Models With Search Engine Augmentation”, -
https://arxiv.org/abs/2308.03958#deepmind
: “Simple Synthetic Data Reduces Sycophancy in Large Language Models”, Jerry Wei, Da Huang, Yifeng Lu, Denny Zhou, Quoc V. Le -
https://arxiv.org/abs/2304.14318#google
: “Q2d: Turning Questions into Dialogs to Teach Models How to Search”, Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb -
https://arxiv.org/abs/2212.13138#google
: “Med-PaLM: Large Language Models Encode Clinical Knowledge”, -
https://arxiv.org/abs/2212.10562#google
: “Character-Aware Models Improve Visual Text Rendering”, -
https://arxiv.org/abs/2211.05102#google
: “Efficiently Scaling Transformer Inference”, -
https://arxiv.org/abs/2210.11399#google
: “U-PaLM: Transcending Scaling Laws With 0.1% Extra Compute”, -
https://arxiv.org/abs/2210.11416#google
: “FLAN: Scaling Instruction-Finetuned Language Models”, -
https://arxiv.org/abs/2210.11610#google
: “Large Language Models Can Self-Improve”, Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han -
https://arxiv.org/abs/2210.08726#google
: “RARR: Attributed Text Generation via Post-hoc Research and Revision”, -
https://arxiv.org/abs/2210.09261#google
: “Challenging BIG-Bench Tasks (BBH) and Whether Chain-of-Thought Can Solve Them”, -
https://arxiv.org/abs/2210.03057#google
: “Language Models Are Multilingual Chain-of-Thought Reasoners”, -
https://arxiv.org/abs/2208.01448#amazon
: “AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model”, -
https://arxiv.org/abs/2207.05608#google
: “Inner Monologue: Embodied Reasoning through Planning With Language Models”, -
https://arxiv.org/abs/2205.10625#google
: “Least-to-Most Prompting Enables Complex Reasoning in Large Language Models”, -
https://arxiv.org/abs/2205.05131#google
: “Unifying Language Learning Paradigms”, -
https://arxiv.org/abs/2204.02311#google
: “PaLM: Scaling Language Modeling With Pathways”, -
https://arxiv.org/abs/2204.01691#google
: “Do As I Can, Not As I Say (SayCan): Grounding Language in Robotic Affordances”,