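Python Code Execution Strategies | Model Automation

A Few Words First

When building agents, the ability to execute code greatly increases a model's usefulness and flexibility. By running code, the model can interact with its external environment in more complex ways and take on a wider range of functions. This article covers two strategies for executing code in a Python environment: one based on native expression evaluation, and a general one based on running scripts.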

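Basic Strategy: Executing Expressions

Python's built-in exec function runs dynamically supplied code, and the execution strategies in this section are built on top of it. The advantage of this approach is flexibility: the Python objects produced by the code can be captured directly.

Implementation 1: Returning the Variable Namespace

Run the code in a copy of the global namespace and return the variables that the run newly created. Names that already exist in the module's globals are not reported, even if the code reassigns them.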

def exec_python_code(code: str) -> dict:
    """Execute the code and return the namespace or error message.

    Args:
        code (str): The code string to execute.
    
    Returns:
        dict: The namespace after execution, or an error message.
    """
    try:
        globalspace, currentspace, newspace = globals().copy(), globals().copy(), {}
        exec(code, globalspace)
        for key, val in globalspace.items():
            if key not in currentspace:
                newspace[key] = str(val)
        return newspace
    except Exception as e:
        return {'error': str(e)}

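Implementation 2: Returning the Result of the Last Expression

Treat the last line of the code as an expression and evaluate it. This suits cases where the result is needed immediately, such as interactive execution or on-the-fly calculation. Because the code is split on newlines, the final expression must fit on a single line.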

def exec_last_expr(code: str):
    """Execute the code and return the result or error message.

    Args:
        code (str): The code string to execute.
    
    Returns:
        Any: The value of the last expression, or an error message.
    """
    local_vars = {}
    # Split the code into lines
    lines = code.strip().split('\n')
    try:
        # Execute all lines except the last one
        exec('\n'.join(lines[:-1]), {}, local_vars)
        # Evaluate the last expression
        result = eval(lines[-1], {}, local_vars)
        return result
    except Exception as e:
        return str(e)
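The line-splitting in exec_last_expr breaks when the final expression spans several lines, and functions defined in the code cannot see each other because globals and locals are separate dicts. One possible workaround, sketched here under the assumption that parsing with the standard ast module is acceptable, is to detach the trailing expression statement from the syntax tree instead of from the text. The name exec_last_expr_ast and the "<agent-code>" filename are made up for this illustration.

```python
import ast


def exec_last_expr_ast(code: str):
    """Run `code`; return the value of the final statement if it is an expression."""
    namespace = {}  # one dict for globals and locals, so functions see each other
    try:
        tree = ast.parse(code)
        last = None
        # Detach the trailing expression statement, if there is one.
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            last = ast.Expression(tree.body.pop().value)
        # Run everything before the final expression.
        exec(compile(tree, "<agent-code>", "exec"), namespace)
        if last is None:
            return None  # the code ended with a statement, not an expression
        return eval(compile(last, "<agent-code>", "eval"), namespace)
    except Exception as e:
        return str(e)


code = """
def double(n):
    return n * 2

double(
    21
)
"""
print(exec_last_expr_ast(code))  # 42
```

Like the original, this returns the error text as a plain string, so callers cannot tell an error apart from a legitimate string result; returning a (success, value) pair would be one way around that.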

Usage Examples: `exec_python_code`

Simple variable assignment

code = """
x = 10
y = 20
z = x + y
"""

result = exec_python_code(code)
print(result)

Output

{'x': '10', 'y': '20', 'z': '30'}

Function definition and call

code = """
def add(a, b):
    return a + b

result = add(5, 7)
"""

result = exec_python_code(code)
print(result)

Output

{'add': '<function add at 0x....>', 'result': '12'}

Loops and conditionals

code = """
result_list = []
for i in range(5):
    if i % 2 == 0:
        result_list.append(i)
"""

result = exec_python_code(code)
print(result)

Output

{'result_list': '[0, 2, 4]', 'i': '4'}

Catching errors

code = """
a = 5
b = 0
c = a / b
"""

result = exec_python_code(code)
print(result)

Output

{'error': 'division by zero'}
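The examples above report variables only; anything the executed code prints still goes to the host process's stdout. If the printed text should be returned as well, one option is to redirect stdout while exec runs. The following is a minimal sketch, not part of the original implementation: the function name exec_capture_stdout and the shape of the returned dict are assumptions made for this illustration, and it uses a fresh namespace instead of a copy of globals().

```python
import contextlib
import io


def exec_capture_stdout(code: str) -> dict:
    """Run `code` in a fresh namespace; return new variables and printed text."""
    namespace = {}
    buffer = io.StringIO()
    try:
        # Everything printed by the executed code lands in `buffer`.
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as e:
        return {"error": str(e), "stdout": buffer.getvalue()}
    # exec adds __builtins__ to the namespace; drop it from the report.
    variables = {k: str(v) for k, v in namespace.items() if k != "__builtins__"}
    return {"variables": variables, "stdout": buffer.getvalue()}


print(exec_capture_stdout("x = 2\nprint(x * 21)"))
# {'variables': {'x': '2'}, 'stdout': '42\n'}
```

Starting from an empty namespace also sidesteps the case where a name already present in the host's globals is silently left out of the result.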

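Usage Examples: `exec_last_expr`

These work like the previous examples, except that the expression you want evaluated goes on the last line.

Simple arithmetic: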

code = """
a = 10
b = 5
a + b
"""

result = exec_last_expr(code)
print(result)

Output

15

Lists and list comprehensions:

code = """
numbers = [1, 2, 3, 4, 5]
squared_numbers = [n**2 for n in numbers]
squared_numbers
"""

result = exec_last_expr(code)
print(result)

Output

[1, 4, 9, 16, 25]

Catching exceptions:

code = """
x = 10
y = 'string'
x + y
"""

result = exec_last_expr(code)
print(result)

Output

unsupported operand type(s) for +: 'int' and 'str'

SageMath

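In particular, SageMath, the powerful open-source mathematics system written in Python that we introduced previously, also supports these execution strategies.

Note, however, that the code must first be preprocessed with the preparser module, which converts Sage-specific syntax into standard Python syntax for subsequent evaluation. Below are a few common examples of SageMath syntax conversion.

SageMath uses the angle-bracket syntax <x,y> to define generators: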

import sage.repl.preparse as preparser

# Preparse a generator declaration
print(preparser.preparse_file('L.<x,y> = ob'))

Output

L = ob; (x, y,) = L._first_ngens(2)

In SageMath, ^ denotes exponentiation rather than Python's XOR operator; XOR is written as ^^ instead.

# Preparse the power operator
print(preparser.preparse_file('a = b ^ c'))

Output

a = b ** c

SageMath wraps the integer type, so 1/2 is no longer floating-point division but stays an exact rational:

# Preparse a rational number
print(preparser.preparse_file('1/2'))

Output

_sage_const_1 = Integer(1); _sage_const_2 = Integer(2)
_sage_const_1 /_sage_const_2

So before executing Sage code, just add a preparser.preparse_file step.
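To make the extra preprocessing step concrete, here is a rough sketch of an executor that accepts an optional source-rewriting function. Nothing in it comes from Sage itself: exec_with_preprocess and toy_caret_to_power are hypothetical names, and the toy preprocessor only stands in for preparser.preparse_file so that the example runs without Sage installed.

```python
from typing import Callable, Optional


def exec_with_preprocess(
    code: str,
    preprocess: Optional[Callable[[str], str]] = None,
    namespace: Optional[dict] = None,
) -> dict:
    """Rewrite `code` with `preprocess`, run it, and return the new variables."""
    namespace = {} if namespace is None else namespace
    before = set(namespace)  # names that existed before the run
    try:
        source = preprocess(code) if preprocess else code
        exec(source, namespace)
    except Exception as e:
        return {"error": str(e)}
    return {
        k: str(v)
        for k, v in namespace.items()
        if k not in before and k != "__builtins__"
    }


# Stand-in for the Sage preparser (hypothetical, for illustration only).
def toy_caret_to_power(source: str) -> str:
    return source.replace("^", "**")


print(exec_with_preprocess("a = 2 ^ 10", toy_caret_to_power))  # {'a': '1024'}
```

With Sage available, one would presumably pass preparser.preparse_file as preprocess and supply a namespace that already contains the Sage names used by the preparsed code (such as Integer), for example one populated by running `from sage.all import *` in it. That setup is an assumption here and has not been verified against a Sage installation.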

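General Strategy: Executing Scripts

Besides the expression-based approach above, executing a script is a simple and general method that carries over to other programming languages. In short, the code is saved to a temporary file and run with the subprocess module, so any scripting language that can be run from the command line can be handled.

Writing the Code

The code is adapted from the Numina repository, with simplified error handling and additional optional parameters.

Below is the full implementation, in which the PythonREPL class provides a simple interface for running Python code: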

import os
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor, TimeoutError
from typing import Tuple

class PythonREPL:
    """Python REPL executor
    
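    Saves the Python code to a temporary file and runs it with the subprocess module.
    
    Args:
        timeout (int): The timeout in seconds for the execution of the Python code.
        python_executable (str): The Python executable to be used for the execution.
        header (str): Code prepended before execution (for example, module imports).
        
    Methods:
        execute(query: str) -> Tuple[bool, str]:
            Run the given Python code string; if the last line is not a print
            statement, wrap it in print() automatically. Returns a tuple of the
            success flag and either the output or the error message.
        
        __call__(query: str) -> Tuple[bool, str]:
            Lets an instance be called like a function; runs the code in a thread
            pool, catches timeouts, and returns the result.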
    """
    def __init__(self, timeout=5, python_executable="python3", header=None):
        self.timeout = timeout
        self.header = header or "import math\nimport numpy as np\nimport sympy as sp\n"
        self.python_executable = python_executable

    def execute(self, query: str) -> Tuple[bool, str]:
        """Execute the provided Python code string"""
        # check if the code contains disallowed libraries
        status, output = self.code_check(query)
        if not status:
            return False, output
        query = self.header + query
        query = query.strip().split("\n")
        if "print(" not in query[-1]:
            if "#" in query[-1]:
                query[-1] = query[-1].split("#")[0]
            query[-1] = "print(" + query[-1] + ")"
        query = "\n".join(query)

        with tempfile.TemporaryDirectory() as temp_dir:
            # create a temporary file with the Python code
            temp_file_path = os.path.join(temp_dir, "tmp.py")
            with open(temp_file_path, "w") as f:
                f.write(query)

            # execute the Python code
            result = subprocess.run(
                [self.python_executable, temp_file_path],
                capture_output=True,
                check=False,
                text=True,
                timeout=self.timeout,
            )

            # return the output or error message
            if result.returncode == 0:
                output = result.stdout
                return True, output.strip()
            else:
                msgs = result.stderr.strip().split("\n")
                # return False, '\n'.join(msgs.strip()) # return the full error message
                new_msgs = []
                want_next = False
                for m in msgs[:-1]:
                    # catch the traceback error message
                    if "Traceback" in m:
                        new_msgs.append(m)
                    elif temp_file_path in m:
                        new_msgs.append(m.replace(temp_file_path, "tmp.py"))
                        want_next = True
                    elif want_next:
                        new_msgs.append(m)
                        want_next = False
                new_msgs.append(msgs[-1])
                error_msg = "\n".join(new_msgs).strip()
                return False, error_msg
    
    def code_check(self, query: str) -> Tuple[bool, str]:
        """Check if the code contains disallowed libraries"""
        # skip the check for now
        # disallowed = ["subprocess", "venv"]
        # for lib in disallowed:
        #     if lib in query:
        #         return False, f"Disallowed library '{lib}' found in the code."
        return True, ""

    def __call__(self, query: str) -> Tuple[bool, str]:
        # submit the execution in a thread pool
        with ThreadPoolExecutor() as executor:
            future = executor.submit(self.execute, query)
            try:
                return future.result(timeout=self.timeout)
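            except (TimeoutError, subprocess.TimeoutExpired):
                # subprocess.run and future.result share one timeout, so either may fire first
                return False, f"Timed out after {self.timeout} seconds."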

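Features of this approach:

Multi-language support: the pattern of a temporary file plus a subprocess works for any interpreter. Pointing python_executable at another interpreter also requires adjusting the Python-specific header and the automatic print() wrapping.
Safety and isolation: the code runs from a temporary file in a child process, which reduces the risk of affecting the main process, although it is not a sandbox.
Flexibility: suits cases where output needs to be captured, and supports a custom code header for importing the required libraries.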
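The multi-language point can be illustrated with a stripped-down runner that leaves out the Python-specific header and print() wrapping. This is a sketch under assumptions, not the Numina code: run_script, its command and suffix parameters, and the error strings are invented for this example.

```python
import os
import subprocess
import sys
import tempfile
from typing import List, Tuple


def run_script(code: str, command: List[str], suffix: str = ".py",
               timeout: float = 5) -> Tuple[bool, str]:
    """Write `code` to a temporary file and run it as `command + [path]`."""
    with tempfile.TemporaryDirectory() as temp_dir:
        path = os.path.join(temp_dir, "tmp" + suffix)
        with open(path, "w") as f:
            f.write(code)
        try:
            result = subprocess.run(
                command + [path],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this
            return False, f"Timed out after {timeout} seconds."
        except FileNotFoundError:
            return False, f"Interpreter not found: {command[0]}"
    if result.returncode == 0:
        return True, result.stdout.strip()
    return False, result.stderr.strip()


# Run a Python snippet with the interpreter that is running this script.
print(run_script("print(1 + 2)", [sys.executable]))  # (True, '3')

# Another interpreter would be passed the same way, assuming it is installed:
# run_script("echo hello", ["bash"], suffix=".sh")
```

Handling subprocess.TimeoutExpired directly inside the runner keeps the timeout logic in one place, so no thread pool is needed just to enforce the time limit.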

Basic Examples

First, create a PythonREPL instance, which lets us run Python code within a specified timeout.

repl = PythonREPL(timeout=10)

We execute code through the __call__ method (that is, repl(...)):

success, output = repl("x = 10 # set value of x\nx + 5")
print(success, output)

Output

True 15

Catching a division-by-zero error: when the code contains an error, PythonREPL captures it and returns the corresponding error message.

success, output = repl("x = 10\nx / 0")
print(success, output)

Output

False Traceback (most recent call last):
  File "tmp.py", line 5, in <module>
    print(x / 0)
ZeroDivisionError: division by zero

Errors in nested calls: when a nested function call fails, the full stack trace is captured as well.

query = """
def fetch_area(circle):
    return circle.area()

class Circle:
    def __init__(self, radius):
        self.radius = radius

# Calling fetch_area with an object that doesn't have an 'area' method
c = Circle(5)
fetch_area(c)
"""
success, output = repl(query)
print(success, output)

Output

False Traceback (most recent call last):
  File "tmp.py", line 14, in <module>
    print(fetch_area(c))
  File "tmp.py", line 6, in fetch_area
    return circle.area()
AttributeError: 'Circle' object has no attribute 'area'

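Summary

We have covered two main strategies for running code:

Built-in functions: exec (together with eval) executes and evaluates code strings directly, which suits rapid prototyping and dynamic expression evaluation.
Script execution: combining subprocess with temporary files runs code in a separate process, which lends itself to multi-language support and to more complex tasks and error handling.

Each approach has its trade-offs, and developers can pick the one that best fits their use case.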