Parsing a command language using Boost Spirit



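Postscript

Q. How do I annotate the parser to create an AST using utree?

  • See the AST and Grammar sections of the answer: instead of utree, the grammar builds a dedicated AST.

Q. How do I walk the utree after it is built, to discover what was parsed?

  • See the same sections; the AST is made of boost::variant nodes, so see also http://www.boost.org/doc/libs/1_64_0/doc/html/variant/tutorial.html

Q. I want to add a comment character, "!". So, how can I ignore everything after that - except when it occurs in a quoted string?

  • Simply make the skipper a rule that also consumes comments, e.g.:

    qi::rule<Iterator> my_skipper;
    my_skipper = blank | '!' >> *(char_ - eol) >> (eol|eoi);

    Then use it instead of blank, as in skip(my_skipper).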

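Q. Why doesn't my error handler get called when I give it invalid input?

  • Because you didn't mark expectation points (operator> instead of operator>>). If you don't, a failure to match a sub-expression simply backtracks.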
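Q. How can I make the command tokens case insensitive, but not change the contents of a quoted string?

  • See the use of no_case[] in the Grammar section of the answer.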

I am building a parser for a command language that I've pieced together from various samples. I've read the Boost Spirit Qi and Lex docs, and I think I understand the basics, but from what I've read, I should avoid attributes and use utree. What docs I've found on utree basically suck. Given the code below, I have the following questions:

  1. How do I annotate the parser to create an AST using utree?
  2. How do I walk the utree after it is built, to discover what was parsed? e.g. for token-only commands, such as SET DEBUG ON, as well as commands with values, such as LOAD "file.ext" or SET DRIVE C:
  3. I want to add a comment character, "!". So, how can I ignore everything after that - except when it occurs in a quoted string?
  4. Why doesn't my error handler get called when I give it invalid input?
  5. How can I make the command tokens case insensitive, but not change the contents of a quoted string?

    #include <Windows.h>
    #include <conio.h>
    #include <string>
    #include <vector>
    #include <iostream>
    
    #define BOOST_SPIRIT_DEBUG
    
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/spirit/include/lex.hpp>
    #include <boost/spirit/include/lex_lexertl.hpp>
    
    using namespace std;
    using namespace boost::spirit;
    using boost::spirit::utree;
    
    //
    // Tokens used by the command grammar
    //
    
    template <typename Lexer>
    struct command_tokens : lex::lexer <Lexer>
        {
        command_tokens () :
    
            //
            // Verbs, with abbreviation (just enough characters to make each unique)
            //
    
            boot        ("B(O(O(T)?)?)?"),
            exit        ("E(X(I(T)?)?)?"),
            help        ("H(E(L(P)?)?)?"),
            dash_help   ("-H(E(L(P)?)?)?"),
            slash_help  ("\/H(E(L(P)?)?)?"),
            load        ("L(O(A(D)?)?)?"),
            quit        ("Q(U(I(T)?)?)?"),
            set         ("SE(T)?"),
            show        ("SH(O(W)?)?"),
    
            //
            // Nouns, with abbreviation (the minimum number of characters is usually 3, but may be more to ensure uniqueness)
            //
    
            debug       ("DEB(U(G)?)?"),
            drive       ("DRI(V(E)?)?"),
            trace       ("TRA(C(E)?)?"),
    
            //
            // Qualifiers
            //
    
            on          ("ON"),
            off         ("OFF"),
    
            //
            // Tokens to pass back to the grammar
            //
    
            quoted_string   ("...")
    
            {
            using namespace boost::spirit::lex;
    
            //
            // Associate the tokens with the lexer
            //
    
            this->self 
                = boot
                | exit
                | help
                | dash_help
                | slash_help
                | load
                | quit
                | set
                | show
                | debug
                | drive
                | trace
                | off
                | on
                | quoted_string
                ;
    
            //
            // Define whitespace to ignore: space, tab, newline
            //
    
            this->self ("WS")
                = lex::token_def <> ("[ \t\n]+")
                ;
            }
    
        lex::token_def <>   boot;
        lex::token_def <>   dash_help;
        lex::token_def <>   debug;
        lex::token_def <string> drive;
        lex::token_def <>   exit;
        lex::token_def <>   help;
        lex::token_def <>   load;
        lex::token_def <>   off;
        lex::token_def <>   on;
        lex::token_def <>   quit;
        lex::token_def <string> quoted_string;
        lex::token_def <>   set;
        lex::token_def <>   show;
        lex::token_def <>   slash_help;
        lex::token_def <>   trace;
        };
    
    //
    // Display parse error
    //
    
    struct error_handler_
        {
        template <typename, typename, typename>
        struct result
            {
            typedef void type;
            };
    
        template <typename Iterator>
        void operator ()
            (
            qi::info const& What,
            Iterator        Err_pos,
            Iterator        Last
            ) const
    
            {
            cout << "Error! Expecting "
                << What
                << " here: \""
                << string (Err_pos, Last)
                << "\""
                << endl;
            }
        };
    
    boost::phoenix::function <error_handler_> const error_handler = error_handler_ ();
    
    //
    // Grammar describing the valid commands
    //
    
    template <typename Iterator, typename Lexer>
    struct command_grammar : qi::grammar <Iterator>
        {
        template <typename Lexer>
        command_grammar (command_tokens <Lexer> const& Tok) :
            command_grammar::base_type (start)
            {
            using qi::on_error;
            using qi::fail;
            using qi::char_;
    
            start
                = +commands;
    
            commands
                = (
                  boot_command
                | exit_command
                | help_command
                | load_command
                | set_command
                | show_command
                );
    
            boot_command
                = Tok.boot;
    
            exit_command
                = Tok.exit
                | Tok.quit;
    
            help_command
                = Tok.help
                | Tok.dash_help
                | Tok.slash_help;
    
            load_command
                = Tok.load >> Tok.quoted_string;
    
            set_command
                = Tok.set;
    
            show_command
                = Tok.show;
    
            set_property
                = debug_property
                | drive_property
                | trace_property;
    
            debug_property
                = Tok.debug >> on_off;
    
           drive_property
                = Tok.drive >> char_ ("A-Z") >> char_ (":");
    
            trace_property
                = Tok.trace >> on_off;
    
            on_off
                = Tok.on
                | Tok.off;
    
            BOOST_SPIRIT_DEBUG_NODE (start);
            BOOST_SPIRIT_DEBUG_NODE (commands);
            BOOST_SPIRIT_DEBUG_NODE (boot_command);
            BOOST_SPIRIT_DEBUG_NODE (exit_command);
            BOOST_SPIRIT_DEBUG_NODE (help_command);
            BOOST_SPIRIT_DEBUG_NODE (load_command);
            BOOST_SPIRIT_DEBUG_NODE (quit_command);
            BOOST_SPIRIT_DEBUG_NODE (set_command);
            BOOST_SPIRIT_DEBUG_NODE (show_command);
            BOOST_SPIRIT_DEBUG_NODE (set_property);
            BOOST_SPIRIT_DEBUG_NODE (debug_property);
            BOOST_SPIRIT_DEBUG_NODE (drive_property);
            BOOST_SPIRIT_DEBUG_NODE (trace_property);
            BOOST_SPIRIT_DEBUG_NODE (target_property);
            on_error <fail> (start, error_handler (_4, _3, _2));
            }
    
        qi::rule <Iterator> start;
        qi::rule <Iterator> commands;
        qi::rule <Iterator> boot_command;
        qi::rule <Iterator> exit_command;
        qi::rule <Iterator> help_command;
        qi::rule <Iterator> load_command;
        qi::rule <Iterator> quit_command;
        qi::rule <Iterator> set_command;
        qi::rule <Iterator> show_command;
        qi::rule <Iterator> set_property;
        qi::rule <Iterator> debug_property;
        qi::rule <Iterator, string ()>  drive_property;
        qi::rule <Iterator> target_property;
        qi::rule <Iterator> trace_property;
        qi::rule <Iterator> on_off;
        };
    
    int
    main
        (
        int     Argc,
        PCHAR   Argv
        )
    {
        typedef std::string::iterator                       base_iterator_type;
        typedef lex::lexertl::token <base_iterator_type>    token_type;
        typedef lex::lexertl::lexer <token_type>            lexer_type;
        typedef command_tokens <lexer_type>                 command_tokens;
        typedef command_tokens::iterator_type               iterator_type;
        typedef command_grammar <iterator_type, command_tokens::lexer_def>  command_grammar;
    
        command_tokens      tokens;
        command_grammar     commands (tokens);
        string              input = "SET DRIVE C:";
        string::iterator    it = input.begin ();
        iterator_type       iter = tokens.begin (it, input.end ());
        iterator_type       end = tokens.end ();
        string              ws ("WS");
    
        bool                result = lex::tokenize_and_phrase_parse (it, input.end (), tokens, commands, qi::in_state (ws) [tokens.self]);
    
        if (result)
            {
            cout << "Parse succeeded" << endl;
            }
        else
            {
            string  rest (it, input.end ());
            cout << "Parse failed" << endl;
            cout << "Stopped at " << rest << endl;
            }
    
        return 0;
    }                           // End of main
    

Solution

I'm going to side-step the majority of your code, for the simple reason that experience tells me that Lex and utree are generally not what you want to use.

What you do want is to define an AST to represent your command language and then come up with a grammar to build it.

AST

namespace Ast {
    struct NoValue {
        bool operator==(NoValue const &) const { return true; }
    };
    template <typename Tag> struct GenericCommand {};

    namespace tag {
        struct boot;
        struct help;
        struct load;
        struct exit;
        struct set;
        struct show;
    };

    template <> struct GenericCommand<tag::load> { std::string name; };

    template <> struct GenericCommand<tag::set> {
        std::string property;
        boost::variant<NoValue, std::string, bool> value; // optional
    };

    using BootCmd = GenericCommand<tag::boot>;
    using HelpCmd = GenericCommand<tag::help>;
    using ExitCmd = GenericCommand<tag::exit>;
    using ShowCmd = GenericCommand<tag::show>;
    using LoadCmd = GenericCommand<tag::load>;
    using SetCmd  = GenericCommand<tag::set>;

    using Command = boost::variant<BootCmd, HelpCmd, ExitCmd, ShowCmd, LoadCmd, SetCmd>;
    using Commands = std::list<Command>;
}

The full code only adds debug output helpers. And here's the full Fusion adaptation:

BOOST_FUSION_ADAPT_TPL_STRUCT((Tag), (Ast::GenericCommand) (Tag), )
BOOST_FUSION_ADAPT_STRUCT(Ast::LoadCmd, name)
BOOST_FUSION_ADAPT_STRUCT(Ast::SetCmd, property, value)

Grammar

Here I make some choices:

  • let's make things white-space and case insensitive, allowing line-separated commands: (see also Boost spirit skipper issues)

    start = skip(blank) [lazy_command % eol];
    

  • let's use the Nabialek Trick to associate commands with prefixes. I used a very simple snippet of code to generate the unique prefixes:

    std::set<std::string> const verbs { "boot", "exit", "help", "-help", "/help", "load", "quit", "set", "show", };
    for (auto const full : verbs)
        for (auto partial=full; partial.length(); partial.resize(partial.size()-1)) {
            auto n = std::distance(verbs.lower_bound(partial), verbs.upper_bound(full));
            if (n < 2) std::cout << "(\"" << partial << "\", &" << full << "_command)\n";
        }
    

  • you could do the same for properties, but I thought the current setup is simpler:

template <typename Iterator>
struct command_grammar : qi::grammar<Iterator, Ast::Commands()> {
    command_grammar() : command_grammar::base_type(start) {
        using namespace qi;

        start = skip(blank) [lazy_command % eol];

        // nabialek trick
        lazy_command = no_case [ commands [ _a = _1 ] > lazy(*_a) [ _val = _1 ] ];

        on_off.add("on", true)("off", false);

        commands.add
            ("-help", &help_command) ("-hel", &help_command) ("-he", &help_command) ("-h", &help_command)
            ("/help", &help_command) ("/hel", &help_command) ("/he", &help_command) ("/h", &help_command)
            ("help", &help_command) ("hel", &help_command) ("he", &help_command) ("h", &help_command)
            ("boot", &boot_command) ("boo", &boot_command) ("bo", &boot_command) ("b", &boot_command)
            ("exit", &exit_command) ("exi", &exit_command) ("ex", &exit_command) ("e", &exit_command)
            ("quit", &exit_command) ("qui", &exit_command) ("qu", &exit_command) ("q", &exit_command)
            ("load", &load_command) ("loa", &load_command) ("lo", &load_command) ("l", &load_command)
            ("set", &set_command) ("se", &set_command)
            ("show", &show_command) ("sho", &show_command) ("sh", &show_command);

        quoted_string = '"' >> +~char_('"') >> '"';

        // nullary commands
        boot_command_ = eps;
        exit_command_ = eps;
        help_command_ = eps;
        show_command_ = eps;

        // non-nullary commands
        load_command_ = quoted_string;
        drive_        = char_("A-Z") >> ':';
        set_command_  = no_case[lit("drive")|"driv"|"dri"|"dr"] >> attr("DRIVE") >> drive_
                | no_case[ (lit("debug")|"debu"|"deb"|"de")     >> attr("DEBUG") >> on_off ]
                | no_case[ (lit("trace")|"trac"|"tra"|"tr"|"t") >> attr("TRACE") >> on_off ]
                ;

        BOOST_SPIRIT_DEBUG_NODES(
                (start)(lazy_command)
                (boot_command) (exit_command) (help_command) (show_command) (set_command) (load_command)
                (boot_command_)(exit_command_)(help_command_)(show_command_)(set_command_)(load_command_)
                (quoted_string)(drive_)
            )

        on_error<fail>(start, error_handler_(_4, _3, _2));
        on_error<fail>(lazy_command, error_handler_(_4, _3, _2));
        boot_command = boot_command_;
        exit_command = exit_command_;
        help_command = help_command_;
        load_command = load_command_;
        exit_command = exit_command_;
        set_command  = set_command_;
        show_command = show_command_;
    }

  private:
    struct error_handler_t {
        template <typename...> struct result { typedef void type; };

        void operator()(qi::info const &What, Iterator Err_pos, Iterator Last) const {
            std::cout << "Error! Expecting " << What << " here: \"" << std::string(Err_pos, Last) << "\"" << std::endl;
        }
    };

    boost::phoenix::function<error_handler_t> const error_handler_ = error_handler_t {};

    qi::rule<Iterator, Ast::Commands()> start;

    using Skipper = qi::blank_type;
    using CommandRule  = qi::rule<Iterator, Ast::Command(), Skipper>;

    qi::symbols<char, bool> on_off;
    qi::symbols<char, CommandRule const*> commands;

    qi::rule<Iterator, std::string()> drive_property, quoted_string, drive_;

    qi::rule<Iterator, Ast::Command(), Skipper, qi::locals<CommandRule const*> > lazy_command;
    CommandRule boot_command, exit_command, help_command, load_command, set_command, show_command;

    qi::rule<Iterator, Ast::BootCmd(), Skipper> boot_command_;
    qi::rule<Iterator, Ast::ExitCmd(), Skipper> exit_command_;
    qi::rule<Iterator, Ast::HelpCmd(), Skipper> help_command_;
    qi::rule<Iterator, Ast::LoadCmd(), Skipper> load_command_;
    qi::rule<Iterator, Ast::SetCmd(),  Skipper> set_command_;
    qi::rule<Iterator, Ast::ShowCmd(), Skipper> show_command_;
};
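The abbreviation table fed to `commands.add` can also be computed instead of typed out. The following is a sketch of one possible approach, not the generator used for the answer: it is a stricter variant that keeps a prefix only when exactly one verb starts with it, and the function name `unique_prefixes` is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: maps every unambiguous prefix to its full verb.
std::map<std::string, std::string> unique_prefixes(std::set<std::string> const& verbs) {
    std::map<std::string, std::string> table;
    for (auto const& full : verbs) {
        for (std::size_t len = full.size(); len > 0; --len) {
            std::string const partial = full.substr(0, len);

            // Count the verbs that start with this prefix.
            int matches = 0;
            for (auto const& v : verbs)
                if (v.compare(0, partial.size(), partial) == 0)
                    ++matches;

            // Keep the prefix only if it identifies a single verb.
            if (matches == 1)
                table[partial] = full;
        }
    }
    return table;
}
```

For the verb list used in the answer, `se` maps to `set` and `sh` to `show`, while the ambiguous `s` is left out. Each resulting pair could then be registered in a `qi::symbols` table much like the hand-written entries.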

Test Cases

Live On Coliru

int main() {
    typedef std::string::const_iterator It;
    command_grammar<It> const commands;

    for (std::string const input : {
            "help",
            "set drive C:",
            "SET DRIVE C:",
            "loAD \"XYZ\"",
            "load \"anything \nat all\"",
            // multiline
            "load \"ABC\"\nhelp\n-he\n/H\nsh\nse t off\nse debug ON\nb\nq"
            })
    {
        std::cout << "----- '" << input << "' -----\n";
        It f = input.begin(), l = input.end();

        Ast::Commands parsed;
        bool result = parse(f, l, commands, parsed);

        if (result) {
            for (auto& cmd : parsed) {
                std::cout << "Parsed " << cmd << "\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        if (f != l) {
            std::cout << "Remaining unparsed '" << std::string(f, l) << "'\n";
        }
    }
}

Prints:

----- 'help' -----
Parsed HELP ()
----- 'set drive C:' -----
Parsed SET (DRIVE C)
----- 'SET DRIVE C:' -----
Parsed SET (DRIVE C)
----- 'loAD "XYZ"' -----
Parsed LOAD (XYZ)
----- 'load "anything 
at all"' -----
Parsed LOAD (anything 
at all)
----- 'load "ABC"
help
-he
/H
sh
se t off
se debug ON
b
q' -----
Parsed LOAD (ABC)
Parsed HELP ()
Parsed HELP ()
Parsed HELP ()
Parsed SHOW ()
Parsed SET (TRACE 0)
Parsed SET (DEBUG 1)
Parsed BOOT ()
Parsed EXIT ()

Full Listing

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/include/io.hpp>
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <list>
#include <string>

namespace qi = boost::spirit::qi;

namespace Ast {
    struct NoValue {
        bool operator==(NoValue const &) const { return true; }
        friend std::ostream& operator<<(std::ostream& os, NoValue) { return os; }
    };
    template <typename Tag> struct GenericCommand {};

    namespace tag {
        struct boot {};
        struct help {};
        struct load {};
        struct exit {};
        struct set {};
        struct show {};

        static std::ostream& operator<<(std::ostream& os, boot) { return os << "BOOT"; }
        static std::ostream& operator<<(std::ostream& os, help) { return os << "HELP"; }
        static std::ostream& operator<<(std::ostream& os, load) { return os << "LOAD"; }
        static std::ostream& operator<<(std::ostream& os, exit) { return os << "EXIT"; }
        static std::ostream& operator<<(std::ostream& os, set ) { return os << "SET"; }
        static std::ostream& operator<<(std::ostream& os, show) { return os << "SHOW"; }
    } // namespace tag

    template <> struct GenericCommand<tag::load> { std::string name; };

    template <> struct GenericCommand<tag::set> {
        std::string property;
        boost::variant<NoValue, std::string, bool> value; // optional
    };

    using BootCmd = GenericCommand<tag::boot>;
    using HelpCmd = GenericCommand<tag::help>;
    using ExitCmd = GenericCommand<tag::exit>;
    using ShowCmd = GenericCommand<tag::show>;
    using LoadCmd = GenericCommand<tag::load>;
    using SetCmd  = GenericCommand<tag::set>;

    using Command = boost::variant<BootCmd, HelpCmd, ExitCmd, ShowCmd, LoadCmd, SetCmd>;
    using Commands = std::list<Command>;

    template <typename Tag>
    static inline std::ostream& operator<<(std::ostream& os, Ast::GenericCommand<Tag> const& command) { 
        return os << Tag{} << " " << boost::fusion::as_vector(command);
    }
}

BOOST_FUSION_ADAPT_TPL_STRUCT((Tag), (Ast::GenericCommand) (Tag), )
BOOST_FUSION_ADAPT_STRUCT(Ast::LoadCmd, name)
BOOST_FUSION_ADAPT_STRUCT(Ast::SetCmd, property, value)

template <typename Iterator>
struct command_grammar : qi::grammar<Iterator, Ast::Commands()> {
    command_grammar() : command_grammar::base_type(start) {
        using namespace qi;

        start = skip(blank) [lazy_command % eol];

        // nabialek trick
        lazy_command = no_case [ commands [ _a = _1 ] > lazy(*_a) [ _val = _1 ] ];

        on_off.add("on", true)("off", false);

        commands.add
            ("-help", &help_command) ("-hel", &help_command) ("-he", &help_command) ("-h", &help_command)
            ("/help", &help_command) ("/hel", &help_command) ("/he", &help_command) ("/h", &help_command)
            ("help", &help_command) ("hel", &help_command) ("he", &help_command) ("h", &help_command)
            ("boot", &boot_command) ("boo", &boot_command) ("bo", &boot_command) ("b", &boot_command)
            ("exit", &exit_command) ("exi", &exit_command) ("ex", &exit_command) ("e", &exit_command)
            ("quit", &exit_command) ("qui", &exit_command) ("qu", &exit_command) ("q", &exit_command)
            ("load", &load_command) ("loa", &load_command) ("lo", &load_command) ("l", &load_command)
            ("set", &set_command) ("se", &set_command)
            ("show", &show_command) ("sho", &show_command) ("sh", &show_command);

        quoted_string = '"' >> +~char_('"') >> '"';

        // nullary commands
        boot_command_ = eps;
        exit_command_ = eps;
        help_command_ = eps;
        show_command_ = eps;

        // non-nullary commands
        load_command_ = quoted_string;
        drive_        = char_("A-Z") >> ':';
        set_command_  = no_case[lit("drive")|"driv"|"dri"|"dr"] >> attr("DRIVE") >> drive_
                | no_case[ (lit("debug")|"debu"|"deb"|"de")     >> attr("DEBUG") >> on_off ]
                | no_case[ (lit("trace")|"trac"|"tra"|"tr"|"t") >> attr("TRACE") >> on_off ]
                ;

        BOOST_SPIRIT_DEBUG_NODES(
                (start)(lazy_command)
                (boot_command) (exit_command) (help_command) (show_command) (set_command) (load_command)
                (boot_command_)(exit_command_)(help_command_)(show_command_)(set_command_)(load_command_)
                (quoted_string)(drive_)
            )

        on_error<fail>(start, error_handler_(_4, _3, _2));
        on_error<fail>(lazy_command, error_handler_(_4, _3, _2));
        boot_command = boot_command_;
        exit_command = exit_command_;
        help_command = help_command_;
        load_command = load_command_;
        set_command  = set_command_;
        show_command = show_command_;
    }

  private:
    struct error_handler_t {
        template <typename...> struct result { typedef void type; };

        void operator()(qi::info const &What, Iterator Err_pos, Iterator Last) const {
            std::cout << "Error! Expecting " << What << " here: \"" << std::string(Err_pos, Last) << "\"" << std::endl;
        }
    };

    boost::phoenix::function<error_handler_t> const error_handler_ = error_handler_t {};

    qi::rule<Iterator, Ast::Commands()> start;

    using Skipper = qi::blank_type;
    using CommandRule  = qi::rule<Iterator, Ast::Command(), Skipper>;

    qi::symbols<char, bool> on_off;
    qi::symbols<char, CommandRule const*> commands;

    qi::rule<Iterator, std::string()> drive_property, quoted_string, drive_;

    qi::rule<Iterator, Ast::Command(), Skipper, qi::locals<CommandRule const*> > lazy_command;
    CommandRule boot_command, exit_command, help_command, load_command, set_command, show_command;

    qi::rule<Iterator, Ast::BootCmd(), Skipper> boot_command_;
    qi::rule<Iterator, Ast::ExitCmd(), Skipper> exit_command_;
    qi::rule<Iterator, Ast::HelpCmd(), Skipper> help_command_;
    qi::rule<Iterator, Ast::LoadCmd(), Skipper> load_command_;
    qi::rule<Iterator, Ast::SetCmd(),  Skipper> set_command_;
    qi::rule<Iterator, Ast::ShowCmd(), Skipper> show_command_;
};

int main() {
    typedef std::string::const_iterator It;
    command_grammar<It> const commands;

    for (std::string const input : {
            "help",
            "set drive C:",
            "SET DRIVE C:",
            "loAD \"XYZ\"",
            "load \"anything \nat all\"",
            // multiline
            "load \"ABC\"\nhelp\n-he\n/H\nsh\nse t off\nse debug ON\nb\nq"
            })
    {
        std::cout << "----- '" << input << "' -----\n";
        It f = input.begin(), l = input.end();

        Ast::Commands parsed;
        bool result = parse(f, l, commands, parsed);

        if (result) {
            for (auto& cmd : parsed) {
                std::cout << "Parsed " << cmd << "\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        if (f != l) {
            std::cout << "Remaining unparsed '" << std::string(f, l) << "'\n";
        }
    }
}

POST-SCRIPT

Q. How do I annotate the parser to create an AST using utree?

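  • See above: the listing builds the AST from Boost.Variant types through attribute propagation rather than with utree.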

Q. How do I walk the utree after it is built, to discover what was parsed?

  • See above, see also http://www.boost.org/doc/libs/1_64_0/doc/html/variant/tutorial.html

Q. I want to add a comment character, "!". So, how can I ignore everything after that - except when it occurs in a quoted string?

  • Simply make the Skipper type a rule that parses e.g.:

    qi::rule<Iterator> my_skipper;
    my_skipper = blank | '!' >> *(char_ - eol) >> (eol|eoi);
    

    Then use it instead of skip(blank) like skip(my_skipper)

Q. Why doesn't my error handler get called when I give it invalid input?

  • Because you didn't mark expectation points (operator> instead of operator>>). Without them, a sub-expression that fails to match simply backtracks, so no expectation failure is raised and the on_error handler is never invoked.

Q. How can I make the command tokens case insensitive, but not change the contents of a quoted string?

  • See above
