{'messages': [{'role': 'system', 'content': 'You are a skilled mathematician specializing in LEAN, the powerful theorem prover.'}, {'role': 'user', 'content': 'Please translate the natural language version to a LEAN version:\nNatural language version: If $r$ is rational $(r \\neq 0)$ and $x$ is irrational, prove that $rx$ is irrational.'}, {'role': 'assistant', 'content': 'LEAN version:\ntheorem exercise_1_1b\n(x : ℝ)\n(y : ℚ)\n(h : y ≠ 0)\n: ( irrational x ) -> irrational ( x * y ) :='}]}
最后转换为 LoRAX 的格式:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
lorax_entries = [item['messages'] for item in instruct_pairs] lorax_entries = [ { 'prompt': item[0]['content'] + '\n' + item[1]['content'], 'completion': item[2]['content'] } for item in lorax_entries ]
valid_entries = [item['messages'] for item in valid_pairs] valid_entries = [ { 'prompt': item[0]['content'] + '\n' + item[1]['content'], 'completion': item[2]['content'] } for item in valid_entries ]
pprint(lorax_entries[0])
转换后的数据样例:
1 2 3 4 5 6 7 8 9 10 11
{'completion': 'LEAN version:\n' 'theorem exercise_1_1b\n' '(x : ℝ)\n' '(y : ℚ)\n' '(h : y ≠ 0)\n' ': ( irrational x ) -> irrational ( x * y ) :=', 'prompt': 'You are a skilled mathematician specializing in LEAN, the powerful ' 'theorem prover.\n' 'Please translate the natural language version to a LEAN version:\n' 'Natural language version: If $r$ is rational $(r \\neq 0)$ and $x$ ' 'is irrational, prove that $rx$ is irrational.'}
保存/上传数据
将处理后的数据保存为 JSONL 文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
import json
defwrite_to_jsonl(data, file_path): withopen(file_path, 'w', encoding='utf-8') as file: for entry in data: # 将字典转换为 JSON 格式的字符串并写入文件 json_line = json.dumps(entry) file.write(json_line + '\n') # 添加换行符以符合 JSONL 格式
Successfully requested finetuning of mistral-7b as `proofnet-model-mistral/2`. (Job UUID: e23f9e75-13e4-49d6-bca9-cd70ea9b9d6e).
Watching progress of finetuning job e23f9e75-13e4-49d6-bca9-cd70ea9b9d6e. This call will block until the job has finished. Canceling or terminating this call will NOT cancel or terminate the job itself.
Job is queued for execution. Time in queue: 0:00:01
Adapter proofnet-model-mistral/2 is not yet ready. Watching progress of finetuning job e23f9e75-13e4-49d6-bca9-cd70ea9b9d6e. This call will block until the job has finished. Canceling or terminating this call will NOT cancel or terminate the job itself.
# Start a fine-tuning job with custom parameters, blocks until training is finished adapter = pb.adapters.create( config=FinetuningConfig( base_model="Qwen/Qwen1.5-7B", # hf_token="<YOUR HUGGINGFACE TOKEN>" # Required for private Huggingface models target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"], ), dataset=dataset, repo=repo, description="changing epochs, rank, and learning rate" )
效果验证及参数下载
使用微调后的适配器进行推理,并与未微调的模型进行效果比较。
验证集示例:
1 2 3 4 5 6 7 8 9
{'completion': 'LEAN version:\n' 'theorem exercise_1_1a\n' ' (x : ℝ) (y : ℚ) :\n' ' ( irrational x ) -> irrational ( x + y ) :=', 'prompt': 'You are a skilled mathematician specializing in LEAN, the powerful ' 'theorem prover.\n' 'Please translate the natural language version to a LEAN version:\n' 'Natural language version: If $r$ is rational $(r \\neq 0)$ and $x$ ' 'is irrational, prove that $r+x$ is irrational.'}
INST] You are a skilled mathematician specializing in LEAN, the powerful theorem prover. Please translate the natural language version to a LEAN version: Natural language version: If $r$ is rational $(r \neq 0)$ and $x$ is irrational, prove that $r+x$ is irrational. [/INST] 2.
[INST] You are a skilled mathematician specializing in LEAN, the powerful theorem prover. Please translate the natural language version to a LEAN version: Natural language version: If $r$ is rational $(r \neq 0)$ and $x$ is irrational, prove that $r+x$ is irrational. [/INST] 3.
微调输出如下,相比未微调版本,包含部分正确逻辑,但模型知识仍然不够:
1 2
theorem exercise_1_1_11 {r x : ℝ} (hr : r ≠ 0) (hx : x ≠ 0) : irrational (r + x) :=
from predibase import Predibase, FinetuningConfig, DeploymentConfig
api_token = "pb_xxx" tenant_id = "xxxxx"
pb = Predibase(api_token=api_token) # Connected to Predibase as User(id=xxx, username=xxx)
lorax_client = pb.deployments.client("mistral-7b-instruct-v0-2") # 插入部署名称 resp = lorax_client.generate("[INST] What are some popular tourist spots in San Francisco? [/INST]") print(resp.generated_text)
输出如下:
1
San Francisco, California is known for its unique blend of culture, history, and natural beauty. Here are some popular tourist spots that you may want to consider visiting:
使用流式响应来逐步获取模型的输出:
1 2 3
for resp in lorax_client.generate_stream("[INST] What are some popular tourist spots in San Francisco? [/INST]"): ifnot resp.token.special: print(resp.token.text, sep="", end="", flush=True)
chat = Chat() chat.model = ''# 插入适配器名称,空字符串则不加载适配器 chat.user("How many helicopters can a human eat in one sitting?") chat.getresponse() chat.print_log()
输出以下对话日志:
1 2 3 4 5 6 7 8 9
--------------- user --------------- How many helicopters can a human eat in one sitting?
--------------- assistant --------------- It is not possible for a human to eat a helicopter in one sitting, as helicopters are not designed to be consumed by humans. They are aircraft with rotating blades and a main body, not food items.