Crushing Copilot at automatic coding! StarChat, the "code by talking" tool, is open-sourced and programmers rejoice (Part 5)


pip install -r requirements.txt 
You need to be logged in to Hugging Face. To do so, run:
huggingface-cli login 
Finally, install Git LFS with:
sudo apt-get install git-lfs 
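The final step is to launch the training! If you are lucky enough to have 8 A100 (80GB) GPUs to run this training, you can run the following command. Training should take around 45 minutes: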
torchrun --nproc_per_node=8 train.py config.yaml --deepspeed=deepspeed_z3_config_bf16.json 
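Here the config.yaml file specifies all the parameters associated with the dataset, the model, and training. You can edit it to adapt the training to a new dataset. Your trained model will then be available on the Hub!
The StarCoder coding assistant is born
Generating plots
The researchers wanted to see how their model would handle basic visualisation tasks, in the spirit of the famous unicorn that GPT-4 drew in TikZ.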
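For readers who script their launches, the command above can be assembled programmatically. The following is a minimal sketch, not part of the original project: the helper name `build_launch_command` and its parameters are hypothetical, and only the file names and flags shown in the command above come from the source. Whether training behaves well with a different GPU count is not stated in the article, so treat `num_gpus` as an assumption to verify.

```python
import shlex


def build_launch_command(num_gpus=8,
                         config="config.yaml",
                         deepspeed_config="deepspeed_z3_config_bf16.json"):
    """Assemble the torchrun invocation as an argument list.

    Hypothetical helper: it only rebuilds the command quoted in the
    article, with the GPU count and file names made configurable.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchrun",
        f"--nproc_per_node={num_gpus}",
        "train.py",
        config,
        f"--deepspeed={deepspeed_config}",
    ]


# Print the default command so it can be inspected or copied into a shell.
command = build_launch_command()
print(shlex.join(command))
```

The list form can be passed directly to `subprocess.run(command, check=True)` on a machine where torchrun and the training script are available.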

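To do this, the researchers prompted the model with a few coding tasks and got good results!
Admittedly, these results are somewhat cherry-picked, since they only selected the completions that produced working code, but the others were not far off.
Example 1: Bar plot
Prompt: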
Give the following data {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'], 'Age':[20, 21, 19, 18], 'Height' : [6.1, 5.9, 6.0, 6.1]}. Can you plot one graph with two subplots as columns. The first is a bar graph showing the height of each person. The second is a bargraph showing the age of each person? Draw the graph in seaborn talk mode.
Response:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('seaborn-talk')

data = {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'],
        'Age':[20, 21, 19, 18],
        'Height':[6.1, 5.9, 6.0, 6.1]}

df=pd.DataFrame(data)

# Bar Graph for Height and Age
fig,(ax1, ax2)=plt.subplots(nrows=1, ncols=2, figsize=(14,7))
sns.barplot(x='Name', y="Height", data=df, palette="Blues_d", ax=ax1).set_title("Height Comparison")
sns.barplot(x='Name',y ="Age", data=df,palette="Reds_d", ax=ax2 ).set_title("Age Comparison")

# Add padding between plots
fig.tight_layout()

# Show the final image
plt.show()
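Note that the response as shown calls `sns` without importing seaborn, so it would only run in a session where `import seaborn as sns` had already been executed. In addition, newer Matplotlib releases renamed the `seaborn-talk` style to `seaborn-v0_8-talk`. The sketch below is a standalone variant written for this article, not the model's output: it draws the same two subplots with plain Matplotlib bars, and the helper names `pick_talk_style` and `plot_height_and_age` as well as the bar colors are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

data = {
    "Name": ["Tom", "Brad", "Kyle", "Jerry"],
    "Age": [20, 21, 19, 18],
    "Height": [6.1, 5.9, 6.0, 6.1],
}


def pick_talk_style():
    """Return whichever 'talk' style name this Matplotlib offers, or None."""
    for name in ("seaborn-v0_8-talk", "seaborn-talk"):
        if name in plt.style.available:
            return name
    return None


def plot_height_and_age(data):
    """Draw height and age as two bar charts side by side."""
    style = pick_talk_style()
    if style is not None:
        plt.style.use(style)
    fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(14, 7))
    ax1.bar(data["Name"], data["Height"], color="steelblue")
    ax1.set_title("Height Comparison")
    ax2.bar(data["Name"], data["Age"], color="indianred")
    ax2.set_title("Age Comparison")
    fig.tight_layout()  # add padding between the two subplots
    return fig


fig = plot_height_and_age(data)
```

Calling `fig.savefig("height_age.png")` afterwards writes the figure to disk; in a notebook, the headless-backend line can be dropped and `plt.show()` used instead.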
Example 2: Mapping
Prompt:
Draw me a map of the world using geopandas. Make it so that only Germany and Spain are colored red. 
Response:
import pandas as pd
from matplotlib import pyplot as plt
import geopandas as gpd
from shapely.geometry import Point, Polygon
%matplotlib inline

# Load data from URL or local file
url = 'https://raw.githubusercontent.com/datasets/geo-countries/master/data/countries.geojson'
df = gpd.read_file(url)

# Create plot
fig, ax = plt.subplots()
ax.set_aspect('equal')
world = gpd.GeoDataFrame({'name': ['World'],
                          'geometry': [Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])]})
germany = df[df['ADMIN'] == "Germany"]
spain = df[df['ADMIN'] == "Spain"]
other = df[(df['ADMIN']!= "Germany") & (df['ADMIN']!= "Spain")]
world.plot(color='lightgrey', edgecolor='white', ax=ax)
germany.plot(color="red", ax=ax)
spain.plot(color="red", ax=ax)
other.plot(color="skyblue", ax=ax)
plt.title("European Countries")
plt.show()
Example 3: Basketball
Prompt:
There was a basketball game with the following stats. player, points, rebounds and assists: J. Harden 42 8 9 J. Embiid 34 13 4 T. Maxey 14 8 1 T. Harris 9 7 2 P. Tucker 6 3 1. Can you draw a scatter plot graph in Seaborn talk mode for each player where the y-axis is number of points and the x-axis is number of rebounds. Make sure the marker is 'o'. Put the name of the player over each point. The title should be "Sixers Players".
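The prompt supplies its data as one flat run of text, so any solution first has to split it into per-player rows. The sketch below shows one way to do that with a regular expression. It is an illustration written for this article, not the model's response (which is not included in this part); the names `STATS_TEXT`, `ROW_PATTERN`, and `parse_stats` are hypothetical, and the pattern assumes every name has the form "initial, period, surname" as in the prompt.

```python
import re

# The stats exactly as they appear in the prompt, without the trailing period.
STATS_TEXT = (
    "J. Harden 42 8 9 J. Embiid 34 13 4 T. Maxey 14 8 1 "
    "T. Harris 9 7 2 P. Tucker 6 3 1"
)

# One row: an initial, a surname, then points, rebounds and assists.
ROW_PATTERN = re.compile(r"([A-Z]\. [A-Za-z]+) (\d+) (\d+) (\d+)")


def parse_stats(text):
    """Split the flat stats string into one dict per player."""
    rows = []
    for name, points, rebounds, assists in ROW_PATTERN.findall(text):
        rows.append({
            "player": name,
            "points": int(points),
            "rebounds": int(rebounds),
            "assists": int(assists),
        })
    return rows


rows = parse_stats(STATS_TEXT)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(rows)` and then plotted with rebounds on the x-axis and points on the y-axis, as the prompt requests.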

