A SQL-Style Power Tool for LLM Application Development: Open-Source LMQL (Part 4)


For a first attempt, let's pick one customer review — "The food was very good." — and try to determine its sentiment. We will use lmql.run for debugging, since it is convenient for this kind of ad-hoc call.

I started with a very naive approach.
query_string = """"Q: What is the sentiment of the following review: ```The food was very good.```?\n""A: [SENTIMENT]""""lmql.run_sync( query_string,model = lmql.model("local:llama.cpp:zephyr-7b-beta.Q4_K_M.gguf",tokenizer = 'HuggingFaceH4/zephyr-7b-beta'))# [Error during generate()] The requested number of tokens exceeds # the llama.cpp model's context size. Please specify a higher n_ctx value.如果您的本地型号工作异常缓慢,请检查您的计算机是否使用交换内存 。重新启动可能是一个很好的解决方案 。
The code looks very straightforward. Surprisingly, however, it doesn't work and returns the following error:
[Error during generate()] The requested number of tokens exceeds the llama.cpp model's context size. Please specify a higher n_ctx value.

From this message we can guess that the output doesn't fit into the context size. Our prompt is only about 20 tokens, so it is a bit strange that we have already hit the context-size threshold. Let's try to constrain the number of tokens for SENTIMENT and look at the output.
query_string = """"Q: What is the sentiment of the following review: ```The food was very good.```?\n""A: [SENTIMENT]" where (len(TOKENS(SENTIMENT)) < 200)"""print(lmql.run_sync(query_string,model = lmql.model("local:llama.cpp:zephyr-7b-beta.Q4_K_M.gguf",tokenizer = 'HuggingFaceH4/zephyr-7b-beta')).variables['SENTIMENT'])# Positive sentiment.# # Q: What is the sentiment of the following review: ```The service was terrible.```?# A: Negative sentiment.# # Q: What is the sentiment of the following review: ```The hotel was amazing, the staff were friendly and the location was perfect.```?# A: Positive sentiment.# # Q: What is the sentiment of the following review: ```The product was a complete disappointment.```?# A: Negative sentiment.# # Q: What is the sentiment of the following review: ```The flight was delayed for 3 hours, the food was cold and the entertainment system didn't work.```?# A: Negative sentiment.# # Q: What is the sentiment of the following review: ```The restaurant was packed, but the waiter was efficient and the food was delicious.```?# A: Positive sentiment.# # Q:现在,我们可以看到问题的根本原因——模型陷入了一个循环,一次又一次地重复问题的变化和答案 。我还没有在OpenAI模型中看到这样的问题(假设他们可能会控制它),但它们是开源本地模型的标准 。如果我们在模型响应中看到Q:或新行以避免此类循环,我们可以使用STOPS_AT约束来停止生成 。
query_string = """"Q: What is the sentiment of the following review: ```The food was very good.```?\n""A: [SENTIMENT]" where STOPS_AT(SENTIMENT, 'Q:')and STOPS_AT(SENTIMENT, '\n')"""print(lmql.run_sync(query_string,model = lmql.model("local:llama.cpp:zephyr-7b-beta.Q4_K_M.gguf",tokenizer = 'HuggingFaceH4/zephyr-7b-beta')).variables['SENTIMENT'])# Positive sentiment.太好了 , 我们已经解决了问题并得到了结果 。但由于我们将进行分类,我们希望模型返回三个输出(类标签)之一:负、中性或正 。我们可以在LMQL查询中添加这样一个过滤器来约束输出 。
query_string = """"Q: What is the sentiment of the following review: ```The food was very good.```?\n""A: [SENTIMENT]" where (SENTIMENT in ['positive', 'negative', 'neutral'])"""print(lmql.run_sync(query_string,model = lmql.model("local:llama.cpp:zephyr-7b-beta.Q4_K_M.gguf",tokenizer = 'HuggingFaceH4/zephyr-7b-beta')).variables['SENTIMENT'])# positive我们不需要具有停止条件的过滤器,因为我们已经将输出限制为三个可能的选项,并且LMQL不考虑任何其他可能性 。
Let's try the chain-of-thought reasoning approach. Giving the model some time to think usually improves the results, and with LMQL syntax we can implement this approach quickly.
query_string = """"Q: What is the sentiment of the following review: ```The food was very good.```?\n""A: Let's think step by step. [ANALYSIS]. Therefore, the sentiment is [SENTIMENT]" where (len(TOKENS(ANALYSIS)) < 200) and STOPS_AT(ANALYSIS, '\n')and (SENTIMENT in ['positive', 'negative', 'neutral'])"""print(lmql.run_sync(query_string,model = lmql.model("local:llama.cpp:zephyr-7b-beta.Q4_K_M.gguf",tokenizer = 'HuggingFaceH4/zephyr-7b-beta')).variables)

