Optional Replacement
The Beesley and Karttunen Book gives an incorrect explanation of the
semantics of the optional replace opearator (->). It
says, on p. 66,
[A (->) B]
denotes an optional replacement, that is, the union of [A -> B]
with the identity relation on A.
That is incorrect, as you can see by comparing the networks compiled
with [a
(->) b] and [[a -> b] | a]:
Regular expression
|
Network
|
| [a
(->) b] |
fs0:
? -> fs0, a -> fs0, b -> fs0, a:b -> fs0. |
| [[a
-> b] | a] |
fs0:
? -> fs1, a -> fs2, b -> fs1, a:b -> fs1.
fs1: ? -> fs1, b -> fs1, a:b -> fs1.
fs2: (no arcs) |
The correct statement is
[A (->) B]
denotes an optional replacement. It is equivalent to [[A -> B] | ?*]*.
The right hand side of the union must be ?*, here the
universal identity relation that maps any string into itself.
Furthermore, we need the zero-plus (Kleene-star) operator around the
union to ensure that any instance of "a" in the upper side string
corresponds both to a "b" and to an "a". For example, the
input "aa" must produce four output strings:
xfst[0]: read regex
[[a -> b] | ?*]*;
360 bytes. 1 state, 4 arcs, Circular.
xfst[1]:
apply
down aa
aa
ab
ba
bb
Without the outmost *
operator, we would only get two outputs, "aa" and "bb".
Thanks to Gary A. Coen for spotting the error!
Last Modified: Tuesday, 10-Jul-2007 18:39:49 PDT