
Matching against a PushbackInputStream

Markus Kahl
Joined: 2011-11-26,

Hey,

I'm implementing a message-based middleware (in the course of the
respective lecture at University) and wanted to implement the reading
of messages in terms of extractors, so that I could write something like
this:

middleware.receive {
  case Register(name, pwd) => println("%s trying to register" format name)
  case Broadcast(name, text) => ...
}

Just a made up example. Anyway, I've implemented it using a
PushbackInputStream and extractors that push back
data if they don't understand it (actually only the first byte which
is a message tag).
Everything works fine if I match directly like this:

middleware.in match { ... }

Though I'd like to have a fallback case which sends an error message
to the other side among other things if no message matches. For this
the #receive from the first example was intended.
Though it doesn't quite work if I provide a real partial function like
in that example.

That is because, if the function does not cover all cases, #unapply is
called twice for a matching extractor:
once for #isDefinedAt and once for actually getting the result. If it is
a total function like

middleware.receive {
  case Register....
  case _ => ...
}

#unapply is called only once, directly for getting the result. The
problem is that I can't cope with #unapply being called either once or
twice. I would like it to be possible for the user to provide a
default case or not.
But if #unapply is called twice (when there is no default case) everything
breaks, because I can read the data only once and then it's gone.
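
The double call described above can be reproduced in isolation. The following is only a sketch, not the code from the gist: the counting extractor `Tag` and the two `receive` variants are hypothetical names made up for illustration. The one-pass variant assumes Scala 2.10 or later, where `PartialFunction#applyOrElse` exists; for a partial function written as a `case` literal it should evaluate the match a single time.

```scala
object DoubleUnapplyDemo {
  // Hypothetical extractor that counts how often it is consulted.
  object Tag {
    var calls = 0
    def unapply(b: Int): Option[Int] = {
      calls += 1
      if (b == 42) Some(b) else None
    }
  }

  // Two-step use of a partial function: test first, then apply.
  // Each step runs the pattern match, so a matching extractor is hit twice.
  def receiveTwoStep(in: Int)(pf: PartialFunction[Int, String]): String =
    if (pf.isDefinedAt(in)) pf(in) else "error: unknown message"

  // One-pass use (assumes Scala 2.10+): the fallback is supplied up front,
  // so the match does not need a separate isDefinedAt test.
  def receiveOnePass(in: Int)(pf: PartialFunction[Int, String]): String =
    pf.applyOrElse(in, (_: Int) => "error: unknown message")
}
```

With `receiveTwoStep(42) { case Tag(n) => "got " + n }` the counter ends at 2, with `receiveOnePass` it ends at 1. That removes the once-or-twice ambiguity for this one call site, but the extractor still has a side effect, so any other caller that uses `isDefinedAt` runs into the same problem.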

I've actually come up with a solution for that, but it breaks the
'full case'.

Here's some code which hopefully makes clear what I am talking about:

https://gist.github.com/1395541

Number.unapply(0 ... 0 0 0 42) <- first call from #isDefinedAt
Number.unapply(empty ... ) <- second call

Any ideas how one could pull this off?
I would like both cases to work.

Regards,

Markus

dcsobral
Joined: 2009-04-23,
Re: Matching against a PushbackInputStream

Just don't match on the InputStream. Create a Scala (collection)
Stream out of it, and use that for matching.

Markus Kahl
Joined: 2011-11-26,
Re: Matching against a PushbackInputStream

Sorry, but I don't see how this is going to help me.
Either way, the problem of #unapply being called either once or
twice remains.
How should the message reader know whether it should consume the
contents of the Stream or not?
If it is called in the course of #isDefinedAt, it must not consume the
data.

Otherwise it may return a different message on the next call,
or nothing at all if there is no more data.

dcsobral
Joined: 2009-04-23,
Re: Re: Matching against a PushbackInputStream

Reading and pushing back are side effects. Side effects don't mesh
well with pattern matching, so my suggestion is that you get rid of
them, period.

Right now, you have pbis: PushbackInputStream that is mutated by the
act of matching on it. Like an Iterator, you cannot observe it as a
whole; instead, you call methods on it that change its state. Dump
that.

My suggestion is using a Scala Stream to represent the whole input.
Because Stream is non-strict, it can be effectively infinite, which
makes it a good data structure to represent input that is being
generated as the program processes it. For example, if I had an
Iterator "it" which I wanted to convert into a Stream (without using
the .toStream method), it could be done like this:

Stream.continually(it).takeWhile(_.hasNext).map(_.next)
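
The same idiom can be applied to a java.io.InputStream directly, since `read()` returns -1 at end of input. This is a minimal sketch; `byteStream` is a hypothetical helper name, not something from the thread. On Scala 2.13 and later `Stream` is deprecated and `LazyList` is the equivalent.

```scala
import java.io.InputStream

object StreamInput {
  // View an InputStream as a Stream of bytes. Each byte is read from the
  // underlying stream at most once; the Stream memoizes what it has seen,
  // so the same bytes can be inspected again without pushing anything back.
  // Note: Stream evaluates its head eagerly, so the first read() happens
  // as soon as byteStream is called.
  def byteStream(in: InputStream): Stream[Byte] =
    Stream.continually(in.read()).takeWhile(_ != -1).map(_.toByte)
}
```

For example, `StreamInput.byteStream(new java.io.ByteArrayInputStream(Array[Byte](0, 0, 0, 42)))` can be traversed twice and yields the same four bytes both times, which is exactly what a repeated #unapply call needs.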

Now, you don't need to push anything back anymore because you have
access to all of it, all the time. What you do, instead, is return the
*unprocessed* part of the Stream together with whatever else you do as
a result of the match. Something like this:

middleware.receive {
  case Register(name, pwd) #:: rest =>
    println("%s trying to register" format name)
    rest
  case Broadcast(name, text) #:: rest =>
    ...
    rest
}

As a practical example of this, using similar data structures, this is
precisely how the parser combinators library works.
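
In the parser-combinator style, an extractor takes the input and hands back the decoded value together with the unconsumed remainder. The sketch below illustrates that under made-up assumptions: the wire format (tag byte 1 followed by a 4-byte big-endian Int), the `Number` extractor body, and this `receive` signature are all hypothetical and not taken from the gist or from the middleware in question.

```scala
object PureExtractors {
  // Hypothetical tag byte for a "number" message.
  val NumberTag: Byte = 1

  object Number {
    // Pure: only inspects the (memoized) Stream and returns the decoded
    // value plus the rest, so it may be called any number of times.
    def unapply(in: Stream[Byte]): Option[(Int, Stream[Byte])] =
      in.take(5).toList match { // tag byte + 4 payload bytes
        case tag :: payload if tag == NumberTag && payload.length == 4 =>
          val n = payload.foldLeft(0)((acc, b) => (acc << 8) | (b & 0xff))
          Some((n, in.drop(5)))
        case _ => None
      }
  }

  // Runs the handler if it matches; otherwise reports an error.
  // On success the handler's result carries the unprocessed input.
  def receive[A](in: Stream[Byte])(
      pf: PartialFunction[Stream[Byte], (A, Stream[Byte])]
  ): Either[String, (A, Stream[Byte])] =
    if (pf.isDefinedAt(in)) Right(pf(in)) else Left("unknown message")
}
```

Here `receive` deliberately uses the two-step `isDefinedAt` / `apply` sequence: because `Number.unapply` has no side effects, the second call sees the same bytes as the first, and the fallback branch works whether or not the user supplies a default case.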

Now, this kind of problem looks very well suited to iteratees. I
expected some people on this list to mention them, as I'm not
comfortable enough with that concept to explain its usage.

Markus Kahl
Joined: 2011-11-26,
Re: Matching against a PushbackInputStream

Hey,

Thanks a lot for your suggestion. I've actually worked with parser
combinators before and have written my own parsers. I don't know why
this didn't come to mind.
As you said, it's pretty much the same, really.

Although the current version, 'tainted' with side effects, lets the user
write a little less code, the solution you suggested is
definitely cleaner.

I hadn't heard the term iteratees before. I shall look it up.
Thank you.

Regards,

Markus

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland