Wednesday, 29 June 2011

Recently, Mark Needham asked an interesting question: in which cases can Scala structural types be legitimately combined with self-type annotations?
Let me digress here to present the context first.
It all began with Mark's post presenting a code snippet that combines the two features. Unlike some people, I wasn't really excited about the example; quite the opposite, to be honest.
I expressed my criticism in a comment on the post, but I'll repeat the points here:

1. Structural types are best suited to uniformly handling types that share some members yet have no common ancestor. Take Socket and ServerSocket in java.net: both have close(), isBound(), isClosed() and getInetAddress() methods, yet you cannot treat them uniformly in Java. Moreover, you cannot change those classes, because they come from the standard library. In Mark's example all classes were under his control, so he could easily make them implement a common trait.

2. Self-type annotations shine when you have a coherent set of abstract members that you want available to your trait's methods. Just wrap the related members in a separate trait and declare it as your self-type. The first benefit is that the syntax clearly communicates the connection between the abstract members (they are grouped in one type). The second is that your traits are now connected one level higher, without the need to enumerate all the necessary abstract members in your main trait.
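
To make the first point concrete, here is a minimal sketch, assuming Scala 2 (where calls on structural types go through reflection, hence the scala.language.reflectiveCalls import). The Bindable alias and the describe helper are names made up for this illustration; only Socket, ServerSocket and their isBound() and isClosed() methods come from the JDK.

```scala
import java.net.{ServerSocket, Socket}
import scala.language.reflectiveCalls

object StructuralTypeSketch {
  // Hypothetical alias: anything that has these two methods qualifies,
  // no common supertype is required.
  type Bindable = {
    def isBound(): Boolean
    def isClosed(): Boolean
  }

  // One function handles both JDK classes uniformly.
  def describe(s: Bindable): String =
    "bound [" + s.isBound() + "], closed [" + s.isClosed() + "]"

  def main(args: Array[String]): Unit = {
    val client = new Socket()       // not connected, not bound
    val server = new ServerSocket() // not bound
    try {
      println(describe(client))
      println(describe(server))
    } finally {
      client.close()
      server.close()
    }
  }
}
```

Neither class had to be modified or wrapped, which is exactly the situation where a structural type earns its keep.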

There is little sense in using structural types for classes that you control. Often there is also little sense in extracting a sole abstract member into its own type (named or structural, it doesn't matter) just to declare it as a self-type. I don't mean always, because in some cases you do want to communicate that the abstract member is a part (possibly the only one, but still) of some important, independent concept.
In Mark's example, I just couldn't tell what concept the "val peopleNodes: NodeSeq" member was a part of. For that reason I found the introduction of a self-type there somewhat artificial.

Then comes the question: is there a valid use case for combining structural types with self-type annotations? That was a hard question for me, because I had never imagined such a combination, nor met one. Nevertheless, I tried to imagine a case where it could make sense.

So first, we need at least two classes that have no common ancestor but are structurally similar, in order to apply Scala's structural types meaningfully. Socket and ServerSocket from java.net came to my mind.
Second, we need a trait with a bunch of abstract members, and we should be able to divide those members into separate coherent groups for a self-type annotation to make sense. I've invented a totally artificial "LoggableSocket" trait that represents a socket-like object able to log its status in some form. Without structural types and self-type annotations, it (and its use) would look like this:

import java.net._

trait LoggableSocket {
  def isConnected(): Boolean
  def isBound(): Boolean
  def log(message: String): Unit
  def logStatus() = {
    log("Socket status: connected [" + isConnected() +
      "], bound [" + isBound() + "]")
  }
}

// helper trait
trait SysoutLogging {
  def log(s: String) = println(s)
}
// usage
val s = new Socket() with LoggableSocket with SysoutLogging
s.logStatus()


Of course, at least one thing is not that nice here: the abstract members related to sockets are intermixed with the member related to logging. Let's fix that by extracting them into their own type and then declaring it as LoggableSocket's self-type:

trait SocketLike {
  def isConnected(): Boolean
  def isBound(): Boolean
}

trait LoggableSocket {
  self: SocketLike =>
  def log(message: String): Unit
  def logStatus() = {
    log("Socket status: connected [" + isConnected() +
      "], bound [" + isBound() + "]")
  }
}
// ...
val s = new Socket() with LoggableSocket with SysoutLogging // Oops! Does not compile

What has just happened? It doesn't compile because Socket doesn't implement the SocketLike trait, which we've just declared as the self-type dependency of LoggableSocket. Socket is a JDK class, so we cannot make it implement that interface.
We could either instantiate it with another "with" after the constructor:

val s = new Socket() with SocketLike with LoggableSocket with SysoutLogging

(but the number of "with"s quickly gets out of control) or we could - yes - introduce a structural type here:

trait LoggableSocket {
  self: {
    def isConnected(): Boolean
    def isBound(): Boolean
  } =>
  def log(message: String): Unit
  def logStatus() = {
    log("Socket status: connected [" + isConnected() +
      "], bound [" + isBound() + "]")
  }
}
// ...
val s = new Socket() with LoggableSocket with SysoutLogging

...and everything works like a charm again. This time the related members are grouped in a meaningful way, which clearly communicates that they are of a different nature from the third abstract member, the log method. In addition, this works for impossible-to-modify JDK classes.
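
For anyone who wants to run the final version end to end, here is a self-contained sketch, assuming Scala 2. Two things in it are additions and not part of the snippets above: the scala.language.reflectiveCalls import (members of a structural type are called via reflection, and the compiler may warn without it), and the BufferLogging trait, a hypothetical stand-in for SysoutLogging that keeps messages in memory so they can be inspected.

```scala
import java.net.Socket
import scala.collection.mutable.ListBuffer
import scala.language.reflectiveCalls

object LoggableSocketSketch {
  trait LoggableSocket {
    // Structural self-type: whatever this trait is mixed into
    // must provide these two methods.
    self: {
      def isConnected(): Boolean
      def isBound(): Boolean
    } =>
    def log(message: String): Unit
    def logStatus(): Unit =
      log("Socket status: connected [" + isConnected() +
        "], bound [" + isBound() + "]")
  }

  // Hypothetical logger that records messages instead of printing them.
  trait BufferLogging {
    val messages: ListBuffer[String] = ListBuffer.empty[String]
    def log(message: String): Unit = messages.append(message)
  }

  def main(args: Array[String]): Unit = {
    val s = new Socket() with LoggableSocket with BufferLogging
    try {
      s.logStatus()
      s.messages.foreach(println)
    } finally {
      s.close()
    }
  }
}
```

A freshly created Socket is neither connected nor bound, so the single logged line reports false for both flags.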

So as you can see - yes, combining the two features, self-type annotations and structural types, occasionally makes sense.
Though I would be far from calling that example natural or real-world :)
I really doubt you will find many cases where the combination makes sense. But maybe I just cannot come up with an example? I would be very interested if you found a valid real-world one.

Saturday, 11 June 2011

SBT Installer Ubuntizer

Installing Java-based development tools has often been a pain in the neck for me, and ridiculously poor compared to the brilliant installation of Ubuntu packages. Luckily we now have at least Java, Tomcat and Maven packaged properly - kudos to the package maintainers! I hate this whole "unpack tarball, set XXX_HOME, then update PATH" theme with a passion. I want to code instead of fighting the environment!

In Scala land, things are yet to improve. There is a cross-platform installer for Scala, but the excellent SBT still requires you to follow some manual steps.
Because I don't see much activity in this area, I've decided to stop complaining and do something about it instead. At least one small step, to begin with.

Today I've packed the SBT installer into a valid sbt-0.7.7.deb package (hosted on my Dropbox account). This is my first Debian package, so I kept it very minimal - just enough to make it installable and uninstallable (at least on Ubuntu; Debian may require copyright files and policy compliance - I don't really know).

EDIT: SBT 0.10.0 is now available as an Ubuntu package too (also from my Dropbox).

Creating the package as a one-off task would not be very compelling for a person like me, because I hate repeating manual work. Therefore I've created a script that first looks up the latest jar on the Google Code page, then downloads the jar and packs it into a deb archive. I've written it so that it can adapt to minor version changes, but of course I doubt it will be that future-proof. Regardless, the foundation is now laid, and some minor tweaks will hopefully suffice when new versions of SBT are released.

I've pushed the script and helper files to GitHub, so feel free to fork and improve!

Thursday, 20 January 2011

Slicing the Cake

If you are interested in Scala, you've probably heard about the Cake pattern. If not, there are good sources available online.

Cake is often considered the most idiomatic way to do Dependency Injection in Scala. Yet surprisingly, until now nobody (to my knowledge) has stated publicly that it is actually a composite pattern, and that DI is only one part of it.
The result is that people who just want DI apply the Cake pattern in its full complexity. Usually that complexity is totally unnecessary and only makes the code harder to understand.

Some people have already noticed that something is wrong with that. I also had a gut feeling that it was a bit too clever for me when we were starting a messaging gateway project in Scala for a GSM operator. Therefore we stuck with old-fashioned yet proven manual constructor injection, something along the lines of DIY-DI. Although people have criticized the pattern, nobody has stated precisely what the problem is. Neither did I, until recently. Just a few days ago it struck me. I did a little research, re-read the original paper, and got it. Let me slice that Cake into its constituent layers to see what it has inside. Then you will clearly see what the problem is with the pattern's common usage.

Self-types as a means to inject dependencies
What is the heart of Dependency Injection? With DI, components neither look up nor create their own collaborators. They simply declare what their dependencies are and rely on external code (usually some framework or a manual factory) to provide those dependencies for them. A component can declare its dependencies in many ways, chosen by the developer and constrained by the DI framework in use. For example, dependencies can be injected via a constructor, via setters, or via a private field. The actual injection is done either manually (in a factory class) or by a framework, which is usually configured via a mix of annotations, XML, plain code and some defaults.
The main benefit of DI achieved this way is the freedom to test individual components in isolation. As a bonus you also get an explicit view of all the components' dependencies and gain some order in component lifecycles, not to mention the additional benefits of some frameworks, like AOP.
Scala adds one more way for a component to declare its dependencies: self-type annotations.
Technically, declaring WebCrawler as the self-type of a trait (let's say BuyerAgent) is similar to declaring all members of WebCrawler in that trait. The self-type (here WebCrawler) becomes a dependency of our BuyerAgent, because we can now refer to WebCrawler's members from BuyerAgent's methods.

import java.net.URL

trait WebCrawler {
  def goTo(target: URL): Unit
  def pageSource: String
}

trait BuyerAgent { this: WebCrawler =>

  def buyItem() = {
    // all WebCrawler members are accessible
    // thanks to the declared self-type
    goTo(new URL("http://ebay.com"))
    val ps = pageSource
    // if item price is ok, click "buy"
  }
}

Note that this dependency is in a sense more intimate than one declared (let's say) via a constructor parameter and kept in a private field. Our example trait does not HAVE its WebCrawler accessible via a field, it IS the WebCrawler. In practice, this lets you skip the references to a WebCrawler field when using WebCrawler's features (because there is no field in the first place).

/* Traditional, constructor-based DI
   (note only classes can have constructors) */
class BuyerAgent(crawler: WebCrawler) {

  def buyItem() = {
    // all references to WebCrawler members
    // are explicit
    crawler.goTo(new URL("http://ebay.com"))
    val ps = crawler.pageSource
    // ...
  }
}

This is both good and bad in my opinion. It works best when dependencies declared this way really model parts or layers of the component (trait). Used this way, self-types let you conveniently split one big component into many independently testable and replaceable layers, while at the same time stressing the tight connection between them. Extracting them into separate classes would make them look more independent than they are. In my opinion, though, self-types can also be abused, when a component uses one to obtain a dependency that certainly IS NOT a part of it. That is probably pretty much a matter of taste.
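
As a sketch of the "independently testable and replaceable layers" claim, the snippet below mixes a stub crawler into BuyerAgent, so the agent can be exercised without any network access. StubCrawler, its canned page source and the "Buy It Now" marker are hypothetical, introduced only for this illustration; buyItem is also given a Boolean result here so that there is something to check.

```scala
import java.net.URL

object SelfTypeTestingSketch {
  trait WebCrawler {
    def goTo(target: URL): Unit
    def pageSource: String
  }

  trait BuyerAgent { this: WebCrawler =>
    // Returns true when the (hypothetical) buy marker is on the page.
    def buyItem(): Boolean = {
      goTo(new URL("http://ebay.com"))
      pageSource.contains("Buy It Now")
    }
  }

  // Hypothetical test double: records visited addresses and serves a canned page.
  trait StubCrawler extends WebCrawler {
    var visited: List[String] = Nil
    def goTo(target: URL): Unit = { visited = target.toString :: visited }
    def pageSource: String = "<button>Buy It Now</button>"
  }

  def main(args: Array[String]): Unit = {
    // The dependency is "injected" simply by choosing what to mix in.
    val agent = new BuyerAgent with StubCrawler
    println(agent.buyItem())
    println(agent.visited)
  }
}
```

Swapping StubCrawler for a real implementation changes only the line that creates the agent; BuyerAgent itself stays untouched.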

So, going back to Cake, let's look at it. What does it have to offer in terms of DI? Idiomatic usage of self-types. Which of Cake's other features do you need to implement DI? Absolutely none. If you applied the whole Cake only in order to get DI, you can safely remove Cake's other elements. Your code will immediately gain clarity and simplicity, to the benefit of you and your team.
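
To illustrate the difference, here is a rough side-by-side sketch. The first half follows the shape in which Cake is commonly presented, as far as I can tell (component traits wrapping nested types plus an abstract member per component); the second half keeps only the self-type. All names are hypothetical and the page content is canned.

```scala
object CakeVersusSelfTypes {
  // --- Full Cake: components wrap nested types ---
  trait CrawlerComponent {
    def crawler: Crawler
    trait Crawler { def pageSource: String }
  }

  trait AgentComponent { this: CrawlerComponent =>
    def agent: Agent
    class Agent {
      def itemAvailable: Boolean = crawler.pageSource.contains("Buy")
    }
  }

  object CakeWiring extends AgentComponent with CrawlerComponent {
    val crawler: Crawler = new Crawler { def pageSource = "Buy now" }
    val agent: Agent = new Agent
  }

  // --- Self-types only: same dependency, no wrappers ---
  trait SimpleCrawler { def pageSource: String }

  trait SimpleAgent { this: SimpleCrawler =>
    def itemAvailable: Boolean = pageSource.contains("Buy")
  }

  trait FixedPageCrawler extends SimpleCrawler {
    def pageSource: String = "Buy now"
  }

  val simpleAgent = new SimpleAgent with FixedPageCrawler

  def main(args: Array[String]): Unit = {
    println(CakeWiring.agent.itemAvailable)
    println(simpleAgent.itemAvailable)
  }
}
```

Both variants give the agent its crawler without the agent creating it; the second one simply has fewer moving parts to read.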

So what are Cake's other constructs (like nested types) for?

Two things.
Preventing unwanted interchange of component parts
First, nested classes in Scala work slightly differently than in Java. If you create more than one instance of your component, you won't be able to interchange their inner parts. For example, if your component has an inner class named Part, you won't be able to take a Part from one instance and pass it to a method of a second instance (even though that would be no problem in Java, since the types look the same). Such a constraint is enforced in the Scala compiler's own code. If you can benefit from such protection, Cake's nested classes are for you. Personally, I haven't seen a need for it yet (I see no problem in mixing inner parts between different instances), though I would be interested to see a case where the presence of that feature makes a real difference. Maybe in the Scala compiler's code it really did, I don't know. It just doesn't attract me enough to make me find out for myself.
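
A minimal sketch of that behaviour, with hypothetical names (Engine, Part, install, inspect). The commented-out line is the call the compiler rejects; the inspect method shows the type projection Engine#Part, which is how you opt out of the restriction when you do want to accept parts from any instance.

```scala
object PathDependentSketch {
  class Engine(val name: String) {
    class Part(val label: String)

    // Accepts only parts created from THIS engine instance.
    def install(p: Part): String = name + " accepted " + p.label

    // Accepts a part of any engine instance (type projection).
    def inspect(p: Engine#Part): String = name + " inspected " + p.label
  }

  def main(args: Array[String]): Unit = {
    val first = new Engine("first")
    val second = new Engine("second")
    val part = new first.Part("piston") // its type is first.Part

    println(first.install(part))
    // second.install(part) // does not compile: first.Part is not second.Part
    println(second.inspect(part))
  }
}
```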

Modeling families of types that vary together covariantly
Second, nested classes have access to all members of the enclosing trait. If you move them outside the enclosing trait, access to those members is lost. In the original paper, the two nested classes of the SubjectObserver type refer to its common type members - the type of Subject and the type of Observer. Both type members are abstract and declared with upper bounds. This way, when you extend SubjectObserver and refine the type members, your updated types immediately propagate to both nested classes, and the compiler helps you keep them consistent. That lets you enforce some domain constraints when you design reusable components for others. In the SubjectObserver example, it forces all extending types to refer to each other in a consistent manner.
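
Below is a sketch of the idea in current Scala 2 syntax, loosely modelled on the paper's SubjectObserver example and written from memory, so treat the details as assumptions. The observer callback is called update here, and the Sensor and Display members are simplified, hypothetical versions.

```scala
object TypeFamilySketch {
  abstract class SubjectObserver {
    // Two abstract types that are refined together in subclasses.
    type S <: Subject
    type O <: Observer

    abstract class Subject { this: S =>
      private var observers: List[O] = Nil
      def subscribe(obs: O): Unit = { observers = obs :: observers }
      def publish(): Unit = observers.foreach(_.update(this))
    }

    trait Observer {
      def update(sub: S): Unit
    }
  }

  // One concrete family: sensors are observed by displays.
  object SensorReader extends SubjectObserver {
    type S = Sensor
    type O = Display

    class Sensor(val label: String) extends Subject {
      private var value = 0.0
      def currentValue: Double = value
      def changeValue(v: Double): Unit = { value = v; publish() }
    }

    class Display extends Observer {
      var lines: List[String] = Nil
      // Receives a Sensor, not just any Subject, so label is available.
      def update(sub: Sensor): Unit = {
        lines = (sub.label + " has value " + sub.currentValue) :: lines
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val sensor = new SensorReader.Sensor("kitchen")
    val display = new SensorReader.Display
    sensor.subscribe(display)
    sensor.changeValue(21.5)
    display.lines.foreach(println)
  }
}
```

Because S and O are fixed once per family, a Display can rely on Sensor-specific members without casts, and an observer from some other SubjectObserver family cannot be subscribed to a Sensor by mistake.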

So besides DI, we get two distinct features from the Cake pattern. Each of them lets us enforce some design decisions with the help of the compiler. In one case it is increased integrity of component instances; in the other, the ability to model type families that change together. Neither of those features is as widely popularized as the Dependency Injection aspect, and they probably deserve better exploration.

Nevertheless, Cake is a composite pattern and DI constitutes just one part of it. If you are considering it in order to get DI, use only self-types and you'll be better off. The nested-classes part of Cake will only obscure your code if you don't plan to leverage its specific features. As you can see, you probably aren't going to need them most of the time, so it's much better to keep things simple.

Why is the pattern most often applied in its full form, when usually only DI matters? I think many factors come into play here: many new concepts at once after coming from Java, and an insufficiently thorough understanding. Whatever the reasons are, we are usually much better off sticking to the KISS rule :)

Originally I planned to add code samples that would show the similarities and differences between DI implemented with the Cake pattern and with plain old constructor injection. Because I'm kind of new to blogging and don't want to spend additional time integrating syntax highlighting now, I'm publishing this post without them. If you are interested, I will add them.
Please share your thoughts on this topic; I'll gladly learn something new and discover new points of view. BTW, if you are a native English speaker, I'll appreciate any corrections to my sometimes odd grammar and vocabulary ;) Thanks!