/usr/share/doc/flex-doc/html/Why-doesn_0027t-flex-have-non_002dgreedy-operators-like-perl-does_003f.html is in flex-doc 2.6.4-6.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- 
The flex manual is placed under the same licensing conditions as the
rest of flex:

Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2012
The Flex Project.

Copyright (C) 1990, 1997 The Regents of the University of California.
All rights reserved.

This code is derived from software contributed to Berkeley by
Vern Paxson.

The United States Government has rights in this work pursuant
to contract no. DE-AC03-76SF00098 between the United States
Department of Energy and the University of California.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

Neither the name of the University nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. -->
<!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Why doesn't flex have non-greedy operators like perl does? (Lexical Analysis With Flex, for Flex 2.6.4)</title>

<meta name="description" content="Why doesn't flex have non-greedy operators like perl does? (Lexical Analysis With Flex, for Flex 2.6.4)">
<meta name="keywords" content="Why doesn't flex have non-greedy operators like perl does? (Lexical Analysis With Flex, for Flex 2.6.4)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html#Top" rel="start" title="Top">
<link href="Indices.html#Indices" rel="index" title="Indices">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="FAQ.html#FAQ" rel="up" title="FAQ">
<link href="Memory-leak-_002d-16386-bytes-allocated-by-malloc_002e.html#Memory-leak-_002d-16386-bytes-allocated-by-malloc_002e" rel="next" title="Memory leak - 16386 bytes allocated by malloc.">
<link href="Whenever-flex-can-not-match-the-input-it-says-_0022flex-scanner-jammed_0022_002e.html#Whenever-flex-can-not-match-the-input-it-says-_0022flex-scanner-jammed_0022_002e" rel="prev" title="Whenever flex can not match the input it says &quot;flex scanner jammed&quot;.">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<a name="Why-doesn_0027t-flex-have-non_002dgreedy-operators-like-perl-does_003f"></a>
<div class="header">
<p>
Next: <a href="Memory-leak-_002d-16386-bytes-allocated-by-malloc_002e.html#Memory-leak-_002d-16386-bytes-allocated-by-malloc_002e" accesskey="n" rel="next">Memory leak - 16386 bytes allocated by malloc.</a>, Previous: <a href="Whenever-flex-can-not-match-the-input-it-says-_0022flex-scanner-jammed_0022_002e.html#Whenever-flex-can-not-match-the-input-it-says-_0022flex-scanner-jammed_0022_002e" accesskey="p" rel="prev">Whenever flex can not match the input it says &quot;flex scanner jammed&quot;.</a>, Up: <a href="FAQ.html#FAQ" accesskey="u" rel="up">FAQ</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Indices.html#Indices" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Why-doesn_0027t-flex-have-non_002dgreedy-operators-like-perl-does_003f-1"></a>
<h3 class="unnumberedsec">Why doesn&rsquo;t flex have non-greedy operators like perl does?</h3>

<p>A DFA can do a non-greedy match by stopping
the first time it enters an accepting state, instead of consuming input until
it determines that no further matching is possible (a &ldquo;jam&rdquo; state).  This
is actually easier to implement than longest leftmost match (which flex does).
</p>
<p>But it&rsquo;s also much less useful than longest leftmost match.  When you
find yourself wishing for non-greedy matching, that&rsquo;s usually a sign that
you&rsquo;re trying to make the scanner do some parsing.  That&rsquo;s generally the
wrong approach, since a scanner lacks the power to do a decent job of it.
It is better either to introduce a separate parser or to split the scanner
into multiple scanners using (exclusive) start conditions.
</p>
<p>For example, to pick out a chunk of input delimited by &lsquo;<samp>BEGIN</samp>&rsquo; and
&lsquo;<samp>END</samp>&rsquo;, you might enter a separate start state once you&rsquo;ve seen the
&lsquo;<samp>BEGIN</samp>&rsquo;. In that state, you might then have a regex that will match
&lsquo;<samp>END</samp>&rsquo; (to kick you out of the state), and perhaps &lsquo;<samp>(.|\n)</samp>&rsquo;
to get a single character within the chunk ...
</p>
<p>This approach also has much better error-reporting properties.
</p>
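<p>The start-condition approach described above could be sketched roughly as
follows.  This is only an illustration, not a recommended layout: the state
name &lsquo;<samp>CHUNK</samp>&rsquo;, the choice to echo text outside a chunk, and the wording
of the error message are all assumptions made for the example.
</p>
<div class="example">
<pre class="example">%option noyywrap
%x CHUNK

%%

"BEGIN"             { BEGIN(CHUNK);   /* enter the exclusive state */ }
.|\n                { ECHO;           /* outside a chunk: copy through */ }

&lt;CHUNK&gt;"END"        { BEGIN(INITIAL); /* the first END closes the chunk */ }
&lt;CHUNK&gt;(.|\n)       { /* one character of the chunk body; collect or ignore it */ }
&lt;CHUNK&gt;&lt;&lt;EOF&gt;&gt;      { fprintf(stderr, "unterminated BEGIN\n"); return 1; }

%%

int main(void)
{
    /* yylex() returns 0 at normal end of input, 1 on an unterminated chunk */
    return yylex();
}
</pre></div>

<p>Inside &lsquo;<samp>CHUNK</samp>&rsquo;, the three-character match for &lsquo;<samp>END</samp>&rsquo; is longer than
the one-character match for &lsquo;<samp>(.|\n)</samp>&rsquo;, so the chunk ends at the first
&lsquo;<samp>END</samp>&rsquo; even though every rule still uses longest leftmost match.  A single
greedy pattern such as &lsquo;<samp>"BEGIN"(.|\n)*"END"</samp>&rsquo; would instead run on to the
last &lsquo;<samp>END</samp>&rsquo; in the input.  The &lsquo;<samp>&lt;&lt;EOF&gt;&gt;</samp>&rsquo; rule shows where a
missing &lsquo;<samp>END</samp>&rsquo; can be reported.
</p>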



</body>
</html>