秒杀自动编码Copilot!「动嘴编程」神器StarChat开源,码农狂喜( 三 )


sample = dataset["train"][0]print(sample)"messages": ["content": "Is it possible to imagine a society without law?","role": "user",},"content": "It is difficult to imagine a society that is able to be maintained without any semblance of Law. Laws exist for the purpose of maintaining society and past human behavior suggests that there would be people that have no interest in promoting social cohesion and norms so it's unlikely that complex social and economic systems could develop without a legal system.","role": "assistant",},"content": 'It seems like you consider the absence of law equal to the absence of anything that could guide the behaviour of the individual, however there are many other such things, like religion, conscience, tradition to mention a few important patterns which are known to have crucial influence on society even today, but also we could imagine more, like universal love if everyone suddenly became "awaken" in some spiritual sense, or simply just having an omnipotent leader who maintains a certain order without ever codifying the principles in law. Is it still difficult to imagine?',"role": "user",},"content": "You are correct that there are other factors that can guide behavior in a society and play a role in shaping individuals' behavior and interactions with each other. However, even in societies where these factors are present, laws still serve an important role in maintaining social order and resolving conflicts. nnFor example, religious teachings may prohibit theft and murder, but laws provide a system of justice for those who violate these prohibitions. Personal morality and conscience may guide individuals to act in a certain way, but laws provide a framework for resolving disputes when conflicting moral beliefs arise. Furthermore, the other factors you mentioned, such as universal love or an omnipotent leader, are not necessarily stable or reliable guides for behavior. Universal love, while a noble ideal, may not be achievable in practice, and an omnipotent leader may be prone to corruption or abuse of power.nnLaws provide a crucial foundation for maintaining social order and ensuring justice. It is possible to imagine alternative systems of governance, but it is unlikely that a society without any sort of legal framework would be stable or functional in the long term.","role": "assistant",},"content": "Yeah, but laws are complicated. Most people can't understand them in depth. Some would argue it is almost a self-serving system which put energy into growing itself(eg.: patent trolling). I think there must be a less complex system which keeps up order in society.","role": "user",},
 
这看起来是有关道德哲学的有趣对话 。现在,来看看如何将这些对话转换为标准格式,以简化推理时生成消息的方式 。
对话的标准格式
对对话进行微调的一种方法是,在每个训练例子中简单地插入系统信息和角色,然后用一个序列末尾的token来分隔每个对话,如. 。例如,上面的对话可以采取这样的形式:
Below is a dialogue between a human and AI assistant ...
Human: Is it possible to imagine a society without law?Assistant: It is difficult to imagine ...Human: It seems like you ...Assistant: You are correct ...Human: Yeah, but laws are complicated .. 
这一方法,对训练来说效果不错,但对推理来说并不理想 。
因为模型会自然产生不需要的转折,直到产生 token,通常需要一些后处理来预防这种情况 。
一个更吸引人的方法是使用像ChatML这样的结构化格式,它用一组特殊的token来包装每个回合,表明查询或响应的作用 。在这种格式中,我们有以下的特殊标记:<|system|>:表示对话的哪一部分包含了系统信息,以调节助手的角色 。<|user|>:表示该信息来自人类用户 。<|assistant|>:表示信息来自于人工智能助手 。<|end|>:表示一个回合或系统信息的结束 。
接下来,写一个函数,用这些token来包装进行的实例,看看它是什么样子的:
system_token = "<|assistant|>"
user_token = "<|user|>"assistant_token = "<|assistant|>"end_token = "<|end|>"
def prepare_dialogue(example):system_msg = "Below is a dialogue between a human and an AI assistant called StarChat."


推荐阅读