【编译原理】让我们来构建一个简单的解释器(Let’s Build A Simple Interpreter. Part 1.)(python/c/c++版)(笔记)
生活随笔
收集整理的這篇文章主要介紹了
【编译原理】让我们来构建一个简单的解释器(Let’s Build A Simple Interpreter. Part 1.)(python/c/c++版)(笔记)
小編覺得挺不錯的,現(xiàn)在分享給大家,幫大家做個參考.
原文:Let’s Build A Simple Interpreter. Part 1.
文章目錄
- 【編譯原理】讓我們來構(gòu)建一個簡單的解釋器(Let’s Build A Simple Interpreter. Part 1.)(python/c/c++版)(part 1)
- 將輸入字符串分解為token標(biāo)記的過程稱為詞法分析(lexical analysis),詞法分析器( lexical analyzer或 lexer )也叫掃描器(scanner )或標(biāo)記器(tokenizer)。
- 用c/c++實現(xiàn)(版本一:原版)
- 版本二(版本一的升級版,[自認(rèn)為]解決了版本一的一些混亂邏輯)
【編譯原理】讓我們來構(gòu)建一個簡單的解釋器(Let’s Build A Simple Interpreter. Part 1.)(python/c/c++版)(part 1)
pascal代碼,我們要為這種代碼做一個解釋器
program factorial;function factorial(n: integer): longint; beginif n = 0 thenfactorial := 1elsefactorial := n * factorial(n - 1); end;varn: integer;beginfor n := 0 to 16 dowriteln(n, '! = ', factorial(n)); end.首先我們先來實現(xiàn)pascal編譯器的一個小功能,兩個數(shù)的加法,我們準(zhǔn)備用python實現(xiàn),但是如果你想用其他語言實現(xiàn)也是可行的,這是它的代碼:
# -*- coding: utf-8 -*- """ @File : 1.py @Time : 2021/5/19 14:44 @Author : Dontla @Email : sxana@qq.com @Software: PyCharm """ # Token types # # EOF (end-of-file) token is used to indicate that # there is no more input left for lexical analysis INTEGER, PLUS, EOF = 'INTEGER', 'PLUS', 'EOF'class Token(object):def __init__(self, type_, value):# token type: INTEGER, PLUS, or EOFself.type = type_# token value: 0, 1, 2. 3, 4, 5, 6, 7, 8, 9, '+', or Noneself.value = valuedef __str__(self):"""String representation of the class instance.Examples:Token(INTEGER, 3)Token(PLUS '+')"""return 'Token({type}, {value})'.format(type=self.type,value=repr(self.value))def __repr__(self):return self.__str__()class Interpreter(object):def __init__(self, text):# 客戶端字符串輸入, 比如 "3+5"self.text = text# self.pos is an index into self.textself.pos = 0# current token instance(當(dāng)前標(biāo)記實例)self.current_token = Nonedef error(self):raise Exception('Error parsing input') # 語法分析輸入出錯def get_next_token(self):"""Lexical analyzer (also known as scanner or tokenizer)This method is responsible for breaking a sentenceapart into tokens. One token at a time.詞法分析器(也稱為掃描器scanner或標(biāo)記器tokenizer)這個方法負(fù)責(zé)將一個句子分解成標(biāo)記tokens。一次一個標(biāo)記"""text = self.text# is self.pos index past the end of the self.text ?# if so, then return EOF token because there is no more# input left to convert into tokens# self.pos索引是否超過self.text的結(jié)尾?# 如果是,則返回EOF標(biāo)記,因為沒有更多的標(biāo)記# 向左輸入以轉(zhuǎn)換為標(biāo)記if self.pos > len(text) - 1:return Token(EOF, None)# get a character at the position self.pos and decide# what token to create based on the single character# 在self.pos位置獲取一個字符,并根據(jù)單個字符決定要創(chuàng)建的標(biāo)記current_char = text[self.pos]# if the character is a digit then convert it to# integer, create an INTEGER token, increment self.pos# index to point to the next character after the digit,# and return the INTEGER token# 如果字符是數(shù)字,則將其轉(zhuǎn)換為整型,創(chuàng)建整型標(biāo)記,增加self.pos索引以指向數(shù)字后面的下一個字符,然后返回整型標(biāo)記if current_char.isdigit(): # isdigit()函數(shù),全是數(shù)字返回True,否則返回Falsetoken = Token(INTEGER, int(current_char)) # 創(chuàng)建一個tokenself.pos += 1return tokenif current_char == '+':token = Token(PLUS, current_char)self.pos += 1return tokenself.error()def eat(self, token_type):# compare the current token type with the passed token# type and if they match then "eat" the current token# and assign the next token to the self.current_token,# otherwise raise an exception.if self.current_token.type == token_type:self.current_token = self.get_next_token()else:self.error()def expr(self):"""expr -> INTEGER PLUS INTEGER"""# set current token to the first token taken from the inputself.current_token = self.get_next_token()# we expect the current token to be a single-digit integerleft = self.current_tokenself.eat(INTEGER)# we expect the current token to be a '+' tokenop = self.current_tokenself.eat(PLUS)# we expect the current token to be a single-digit integerright = self.current_tokenself.eat(INTEGER)# after the above call the self.current_token is set to# EOF token# at this point INTEGER PLUS INTEGER sequence of tokens# has been successfully found and the method can just# return the result of adding two integers, thus# effectively interpreting client inputresult = left.value + right.valuereturn resultdef main():while True:try:# 要在Python3下運行,請將“raw_input”調(diào)用替換為“input”# text = raw_input('calc> ')text = input('calc> ') # 獲取鍵盤輸入,參數(shù)為提示信息except EOFError: # 不知是什么異常breakif not text:continueinterpreter = Interpreter(text)result = interpreter.expr()print(result)if __name__ == '__main__':main()運行結(jié)果:
D:\python_virtualenv\my_flask\Scripts\python.exe C:/Users/Administrator/Desktop/新建文件夾/1.py calc> 1+2 3 calc>將輸入字符串分解為token標(biāo)記的過程稱為詞法分析(lexical analysis),詞法分析器( lexical analyzer或 lexer )也叫掃描器(scanner )或標(biāo)記器(tokenizer)。
>>> from calc1 import Interpreter >>> >>> interpreter = Interpreter('3+5') >>> interpreter.get_next_token() Token(INTEGER, 3) >>> >>> interpreter.get_next_token() Token(PLUS, '+') >>> >>> interpreter.get_next_token() Token(INTEGER, 5) >>> >>> interpreter.get_next_token() Token(EOF, None) >>>讓我們回顧一下您的解釋器如何評估算術(shù)表達式:
- 解釋器接受一個輸入字符串,比如說“3+5”
- 解釋器調(diào)用expr方法在詞法分析器get_next_token返回的標(biāo)記流中查找結(jié)構(gòu)。它試圖找到的結(jié)構(gòu)是INTEGER PLUS INTEGER的形式。在確認(rèn)結(jié)構(gòu)后,它通過添加兩個INTEGER標(biāo)記的值來解釋輸入,因為此時解釋器很清楚它需要做的是添加兩個整數(shù) 3 和 5。
檢查理解:
什么是解釋器interpreter?
什么是編譯器compiler?
解釋器和編譯器有什么區(qū)別?
什么是標(biāo)記token?
將輸入分解為標(biāo)記的過程的名稱是什么?(詞法分析)
進行詞法分析lexical analysis的解釋器的部分是什么?
解釋器或編譯器的那部分的其他常見名稱是什么?
用c/c++實現(xiàn)(版本一:原版)
#include <stdio.h> #include <stdlib.h> #include <memory.h> #include <string.h>struct Interpreter {char* text;int pos;struct Token (*get_next_token)(struct Interpreter*); };struct Token {int type;char value; };struct Token get_next_token(struct Interpreter* pipt) {if (pipt->pos > (strlen(pipt->text)-1)) {struct Token token = {3, '\0'};//3表示EOF,2表示+,1表示數(shù)字return token;}char current_char = pipt->text[pipt->pos];if (current_char>='0' && current_char<='9') {struct Token token = {1, current_char};pipt->pos++;return token;}if (current_char == '+') {struct Token token = { 2, current_char };pipt->pos++;return token;}printf("輸入非法!\n");exit(-1);//如果都不是以上的字符,則報錯并退出程序 }char eat(struct Token* pcurrent_token, struct Interpreter* pipt, int type) {char former_token_value = pcurrent_token->value;if (pcurrent_token->type == type) {*pcurrent_token = pipt->get_next_token(pipt);}else {printf("輸入非法!\n");exit(-1);}return former_token_value; }int expr(char* text) {struct Interpreter ipt = {text, 0, get_next_token};struct Token current_token = ipt.get_next_token(&ipt);char temp;temp = eat(¤t_token, &ipt, 1);//斷言第一個字符是數(shù)字int left = temp - '0';eat(¤t_token, &ipt, 2);//斷言第三個字符是加號temp = eat(¤t_token, &ipt, 1);//斷言第三個字符是數(shù)字int right = temp - '0';int result = left + right;return result;}int main() {char text[10];while (1){printf("請輸入算式:\n");scanf_s("%s", text, sizeof(text));int result = expr(text);printf("= %d\n\n", result);}return 0; }運行結(jié)果:
請輸入算式: 2+8 = 10請輸入算式: 1+5 = 6請輸入算式: 3+4 = 7請輸入算式:版本二(版本一的升級版,[自認(rèn)為]解決了版本一的一些混亂邏輯)
#include <stdio.h> #include <stdlib.h> #include <memory.h> #include <string.h>#define is_digital 0 #define is_plus 1 #define is_minus 2 #define is_EOF 3struct Token {int type;char value; };struct Interpreter {char* text;int pos;struct Token current_token;//struct Token(*get_next_token)(struct Interpreter*); };void error() {printf("輸入非法!\n");exit(-1); }void get_next_token(struct Interpreter* pipt) {if (pipt->pos > (strlen(pipt->text) - 1)) {pipt->current_token = { is_EOF, '\0' };return;//struct Token token = { is_EOF, '\0' };//3表示EOF,2表示+,1表示數(shù)字//pipt->current_token = token;}char current_char = pipt->text[pipt->pos];if (current_char >= '0' && current_char <= '9') {//struct Token token = { is_digital, current_char };pipt->current_token = { is_digital, current_char };pipt->pos++;return;//pipt->current_token = token;}if (current_char == '+') {//struct Token token = { is_plus , current_char };pipt->current_token = { is_plus, current_char };pipt->pos++;return;//pipt->current_token = token;}if (current_char == '-') {//struct Token token = { is_minus , current_char };pipt->current_token = { is_minus, current_char };pipt->pos++;return;//pipt->current_token = token;}error();//如果都不是以上的字符,則報錯并退出程序 }char eat(struct Interpreter* pipt, int type) {get_next_token(pipt);char token_value = pipt->current_token.value;if (pipt->current_token.type == type) {return token_value;}else {error();} }int expr(char* text) {struct Interpreter ipt = { text, 0 };//get_next_token(&ipt);char temp;temp = eat( &ipt, is_digital);//斷言第一個字符是數(shù)字int left = temp - '0';eat( &ipt, is_plus);//斷言第三個字符是加號temp = eat( &ipt, is_digital);//斷言第三個字符是數(shù)字int right = temp - '0';int result = left + right;return result;}int main() {char text[10];while (1){printf("請輸入算式:\n");scanf_s("%s", text, sizeof(text));int result = expr(text);printf("= %d\n\n", result);}return 0; }運行結(jié)果:
請輸入算式: 3+6 = 9請輸入算式: 8+3 = 11請輸入算式:總結(jié)
以上是生活随笔為你收集整理的【编译原理】让我们来构建一个简单的解释器(Let’s Build A Simple Interpreter. Part 1.)(python/c/c++版)(笔记)的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: C语言函数指针(结构体函数指针)
- 下一篇: 【编译原理】让我们来构建一个简单的解释器