Troubles with boost::spirit::lex & whitespace
Question
I am trying to learn boost::spirit. To do that, I wanted to create some simple lexers, combine them, and then start parsing with Spirit. I tried modifying the example, but it doesn't run as expected (the result r isn't true).
This is the lexer:
#include <boost/spirit/include/lex_lexertl.hpp>
namespace lex = boost::spirit::lex;
template <typename Lexer>
struct lexer_identifier : lex::lexer<Lexer>
{
lexer_identifier()
: identifier("[a-zA-Z_][a-zA-Z0-9_]*")
, white_space("[ \t\n]+")
{
using boost::spirit::lex::_start;
using boost::spirit::lex::_end;
this->self = identifier;
this->self("WS") = white_space;
}
lex::token_def<> identifier;
lex::token_def<> white_space;
std::string identifier_name;
};
And this is the example I'm trying to run:
#include "stdafx.h"
#include <boost/spirit/include/lex_lexertl.hpp>
#include "my_Lexer.h"
namespace lex = boost::spirit::lex;
int _tmain(int argc, _TCHAR* argv[])
{
typedef lex::lexertl::token<char const*,lex::omit, boost::mpl::false_> token_type;
typedef lex::lexertl::lexer<token_type> lexer_type;
typedef lexer_identifier<lexer_type>::iterator_type iterator_type;
lexer_identifier<lexer_type> my_lexer;
std::string test("adedvied das934adf dfklj_03245");
char const* first = test.c_str();
char const* last = &first[test.size()];
lexer_type::iterator_type iter = my_lexer.begin(first, last);
lexer_type::iterator_type end = my_lexer.end();
while (iter != end && token_is_valid(*iter))
{
++iter;
}
bool r = (iter == end);
return 0;
}
r is true as long as there is only one token in the string. Why is this the case?
Regards, Tobias
Recommended answer
You have created a second lexer state, but never invoked it.
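To see why the loop stops after the first token, here is a minimal toy model of lexer states. It is not Boost.Spirit code: it uses std::regex, and the names StateTable, make_table and lex_in_state are hypothetical, made up for this illustration. Each state owns its own patterns, and a lexer that never switches state can only match what that one state knows about.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical toy model (not the Spirit.Lex API): every named lexer state
// owns its own list of token patterns.
using StateTable = std::map<std::string, std::vector<std::regex>>;

// Mirrors the question: identifiers live in "INITIAL", whitespace in "WS".
StateTable make_table()
{
    StateTable table;
    table["INITIAL"].push_back(std::regex("[a-zA-Z_][a-zA-Z0-9_]*"));
    table["WS"].push_back(std::regex("[ \t\n]+"));
    return table;
}

// Returns how many characters can be tokenized if the lexer never leaves `state`.
std::size_t lex_in_state(const StateTable& table, const std::string& state,
                         const std::string& input)
{
    const std::vector<std::regex>& patterns = table.at(state);
    std::size_t pos = 0;
    while (pos < input.size()) {
        std::size_t longest = 0;
        for (const std::regex& re : patterns) {
            std::smatch m;
            // match_continuous: the match has to start exactly at `pos`
            if (std::regex_search(input.begin() + pos, input.end(), m, re,
                                  std::regex_constants::match_continuous)) {
                longest = std::max(longest, static_cast<std::size_t>(m.length(0)));
            }
        }
        if (longest == 0) {
            break; // no pattern of this state matches here: lexing stops
        }
        pos += longest;
    }
    return pos;
}
```

With the test string from the question, lex_in_state(make_table(), "INITIAL", ...) stops right after "adedvied", because the space is only known to the "WS" state. A string holding a single identifier is consumed completely, which matches the behaviour described in the question.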
In most cases, the easiest way to get the desired effect is single-state lexing with a pass_ignore flag on the skippable tokens:
this->self += identifier
| white_space [ lex::_pass = lex::pass_flags::pass_ignore ];
Note that this requires an actor_lexer to allow for the semantic action:
typedef lex::lexertl::actor_lexer<token_type> lexer_type;
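The effect of an ignore flag can be sketched without Spirit as well. The following is only an illustration under assumptions: Rule and tokenize_single_state are hypothetical names, and std::regex stands in for the generated lexer. All patterns live in one state; a match of a rule marked ignore is consumed but never handed to the caller.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical rule type: a pattern plus a flag that plays the role of pass_ignore.
struct Rule {
    std::regex pattern;
    bool ignore;
};

// Single-state lexing: returns true if the whole input was consumed.
// Tokens matched by non-ignored rules are appended to `tokens`.
bool tokenize_single_state(const std::string& input, std::vector<std::string>& tokens)
{
    static const std::vector<Rule> rules = {
        { std::regex("[a-zA-Z_][a-zA-Z0-9_]*"), false }, // identifier: reported
        { std::regex("[ \t\n]+"), true },                // whitespace: matched, then dropped
    };

    std::size_t pos = 0;
    while (pos < input.size()) {
        bool matched = false;
        for (const Rule& rule : rules) {
            std::smatch m;
            if (std::regex_search(input.begin() + pos, input.end(), m, rule.pattern,
                                  std::regex_constants::match_continuous)
                && m.length(0) > 0) {
                if (!rule.ignore) {
                    tokens.push_back(m.str(0));
                }
                pos += static_cast<std::size_t>(m.length(0));
                matched = true;
                break;
            }
        }
        if (!matched) {
            return false; // character not covered by any rule
        }
    }
    return true;
}
```

For the test string used here, this yields the three identifiers and reports success, since the whitespace between them is consumed silently.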
Full sample:
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/phoenix_operator.hpp> // for the semantic action (lex::_pass = ...)
#include <iostream>
#include <string>
namespace lex = boost::spirit::lex;
template <typename Lexer>
struct lexer_identifier : lex::lexer<Lexer>
{
lexer_identifier()
: identifier("[a-zA-Z_][a-zA-Z0-9_]*")
, white_space("[ \t\n]+")
{
using boost::spirit::lex::_start;
using boost::spirit::lex::_end;
this->self += identifier
| white_space [ lex::_pass = lex::pass_flags::pass_ignore ];
}
lex::token_def<> identifier;
lex::token_def<> white_space;
std::string identifier_name;
};
int main(int argc, const char *argv[])
{
typedef lex::lexertl::token<char const*,lex::omit, boost::mpl::false_> token_type;
typedef lex::lexertl::actor_lexer<token_type> lexer_type;
typedef lexer_identifier<lexer_type>::iterator_type iterator_type;
lexer_identifier<lexer_type> my_lexer;
std::string test("adedvied das934adf dfklj_03245");
char const* first = test.c_str();
char const* last = &first[test.size()];
lexer_type::iterator_type iter = my_lexer.begin(first, last);
lexer_type::iterator_type end = my_lexer.end();
while (iter != end && token_is_valid(*iter))
{
++iter;
}
bool r = (iter == end);
std::cout << std::boolalpha << r << "\n";
}
Prints
true
"WS" as a Skipper state
It is also possible that you came across a sample that uses the second parser state for the skipper (lex::tokenize_and_phrase_parse). Let me take a minute or 10 to create a working sample for that.
Update: It took me a bit more than 10 minutes (waaaah) :) Here's a comparative test, showing how the lexer states interact, and how to use Spirit Skipper parsing to invoke the second parser state:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <iostream>
#include <string>
namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;
template <typename Lexer>
struct lexer_identifier : lex::lexer<Lexer>
{
lexer_identifier()
: identifier("[a-zA-Z_][a-zA-Z0-9_]*")
, white_space("[ \t\n]+")
{
this->self = identifier;
this->self("WS") = white_space;
}
lex::token_def<> identifier;
lex::token_def<lex::omit> white_space;
};
int main()
{
typedef lex::lexertl::token<char const*, lex::omit, boost::mpl::true_> token_type;
typedef lex::lexertl::lexer<token_type> lexer_type;
typedef lexer_identifier<lexer_type>::iterator_type iterator_type;
lexer_identifier<lexer_type> my_lexer;
std::string test("adedvied das934adf dfklj_03245");
{
char const* first = test.c_str();
char const* last = &first[test.size()];
// cannot lex in just the WS state:
bool ok = lex::tokenize(first, last, my_lexer, "WS");
std::cout << "Starting state WS: " << std::boolalpha << ok << "\n";
}
{
char const* first = test.c_str();
char const* last = &first[test.size()];
// cannot lex in just default state either:
bool ok = lex::tokenize(first, last, my_lexer, "INITIAL");
std::cout << "Starting state INITIAL: " << std::boolalpha << ok << "\n";
}
{
char const* first = test.c_str();
char const* last = &first[test.size()];
bool ok = lex::tokenize_and_phrase_parse(first, last, my_lexer, *my_lexer.self, qi::in_state("WS")[my_lexer.self]);
ok = ok && (first == last); // verify full input consumed
std::cout << std::boolalpha << ok << "\n";
}
}
Output is
Starting state WS: false
Starting state INITIAL: false
true