NSDuctTape now supports (rudimentary) two-way communication

This is just a quick note to let you know that NSDuctTape now supports communication in both directions (i.e., you can now instantiate and manipulate Objective-C objects from .NET). The support is still rudimentary, and it's not especially robust, but it's there, and it works. Currently, the new code is only available through Subversion, but I'll probably clean it up a bit and post a new download this weekend.

Bindings and the Model-View-Controller Pattern

There has been a lot of talk lately about the usefulness (or lack thereof) of design patterns in software, but the Model-View-Controller pattern is, without a doubt, one of my favorites. For any who are unfamiliar with it, Model-View-Controller separates the code of an application into three tiers: code that presents an interface to the user (View), code that actually does useful stuff (Model), and code that glues it all together (Controller). On many platforms, such as Apple's Cocoa or Microsoft's WPF, the view may be partially (or completely) implemented without explicitly writing any code, which can greatly speed the development process.

One under-utilized implication of the Model-View-Controller Pattern is that very careful application of the pattern allows models to be reused in multiple contexts; for example, a model might be used as the backbone of both a Windows application and a web application (or theoretically, for a Windows application and a Macintosh application; this is currently rather difficult, although I am working to make it a practical reality). However, even when the models are reused, the views and controllers must be rewritten from scratch. In the case of the view, this is often not a terrible burden, especially when there are good tools for creating views on your platform of choice. On the other hand, rewriting a controller is tedious, boring work. Most of the time, a controller does little more than shuttle data between the view and the model and occasionally disable a control or two. You'd think that there'd be an easier way to hook everything up that didn't involve so much pain and suffering.

As it turns out, you'd be right.

A number of modern MVC frameworks provide a relatively new facility known as "bindings". The UI widgets in such frameworks provide you with the opportunity to specify data sources for various properties (for example, the text in a text field or the enabled state of a button). At runtime, the widgets sign up to receive notifications when the applicable properties on their data sources change (generally using something akin to the Observer pattern). If the data source changes its value for a property, the view will automagically catch this update and change its display accordingly. Likewise, when the user changes a value in the UI, the view propagates this change down to the model.
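
To make the mechanism concrete, here is a minimal sketch of the notification round trip in C#. INotifyPropertyChanged is the standard .NET interface for this kind of change notification; PersonModel and BoundLabel are hypothetical names invented for the illustration, and BoundLabel is only a hand-rolled stand-in for what a real widget toolkit would do for you.

```csharp
using System;
using System.ComponentModel;

// Hypothetical model: raises PropertyChanged whenever Name changes.
public class PersonModel : INotifyPropertyChanged
{
    private string _name = "";

    public event PropertyChangedEventHandler PropertyChanged;

    public string Name
    {
        get { return _name; }
        set
        {
            if (_name == value)
                return;
            _name = value;
            var handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs("Name"));
        }
    }
}

// Hypothetical stand-in for a text widget bound to PersonModel.Name.
public class BoundLabel
{
    private readonly PersonModel _source;

    public string Text { get; private set; }

    public BoundLabel(PersonModel source)
    {
        _source = source;
        Text = source.Name;
        // Sign up for change notifications (the Observer pattern).
        source.PropertyChanged += (sender, e) =>
        {
            if (e.PropertyName == "Name")
                Text = _source.Name;
        };
    }

    // Simulates the user editing the widget: push the new value down to the model.
    public void UserTyped(string text)
    {
        _source.Name = text;
    }
}
```

Setting model.Name updates the label's Text, and calling UserTyped() on the label updates the model, with no controller code shuttling the value in either direction.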

While this ability to bind directly to a model is incredibly useful and can greatly reduce the amount of code required for many applications, it also leaves us with a question: where do we put all that code that doesn't exactly fit in a model but that also cannot be adequately represented in the view (for example, the enabled state of a button)? One answer that has been proposed is Dan Crevier's DataModel-View-ViewModel pattern. In this pattern, the model (which is renamed to the "data model") and the view remain essentially unchanged, but the controller morphs into being just another model. However, rather than modeling the underlying data, a view model keeps track of application state.
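
As a rough sketch of what such a view model might look like, assuming a hypothetical Account data model and a hypothetical Save button whose enabled state the view binds to CanSave (none of these names come from Dan Crevier's posts):

```csharp
using System;
using System.ComponentModel;

// Hypothetical data model: just the underlying data.
public class Account
{
    public string UserName { get; set; }
}

// Hypothetical view model: tracks application state for one screen.
public class AccountViewModel : INotifyPropertyChanged
{
    private readonly Account _account;

    public event PropertyChangedEventHandler PropertyChanged;

    public AccountViewModel(Account account)
    {
        _account = account;
    }

    public string UserName
    {
        get { return _account.UserName; }
        set
        {
            if (_account.UserName == value)
                return;
            _account.UserName = value;
            Notify("UserName");
            // The button's enabled state depends on UserName, so announce it too.
            Notify("CanSave");
        }
    }

    // A view would bind the Save button's enabled state to this property.
    public bool CanSave
    {
        get { return !string.IsNullOrEmpty(_account.UserName); }
    }

    private void Notify(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
            handler(this, new PropertyChangedEventArgs(propertyName));
    }
}
```

Nothing in this class refers to a widget, so the enabled-state rule can be exercised without creating any UI at all.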

When properly designed, a view model is much more unit testable than a controller, and it seems to me that it ought to be more portable between MVC frameworks, as well. Unfortunately, this pattern comes with a practical hurdle that must be overcome: sometimes, a desired behavior cannot be adequately modeled in a way that is usable by the view. It's often tempting to use this as an excuse to bake platform-specific code into a view model, but I find this to be inelegant. In such cases, a developer has two options: either new UI widgets can be created (or extended) to understand the new behaviors, or a controller may be reintroduced in order to serve as a bridge between the view model and the view.

I'm fairly certain that each of these approaches has situations in which it is the best solution, but I strongly suspect that a new controller is the best choice in the majority of cases. While this method does add a fourth tier to the system, which will increase complexity, I believe that in most cases, it will ultimately be less work than baking new functionality right into a view toolkit.

I really wish I could come up with a snappy ending for this post, but it just isn't coming, tonight. In closing, if you're not using MVC in your designs right now (or if you're using it in an undisciplined fashion), I strongly recommend that you study up on it and see if it's right for you. If you're already using MVC, but you aren't using bindings, I'd suggest that you look into whether or not your framework supports them, as they can save you from quite a bit of tedious code.

Tutorial for NSDuctTape now available

I've posted an introductory tutorial to NSDuctTape on Google Code.

Expect more in the not-too-distant future.

Crazy Extension Methods: ToLazyList

Way back in November, I promised to show how to optimize my final version of GetPrimes. Today, I give you the solution to the problem:

using System;
using System.Collections.Generic;
using System.Linq;
 
namespace Utility
{
  public static class EnumerableUtility
  {
    public static IList<T> ToLazyList<T>(this IEnumerable<T> list)
    {
      return new LazyList<T>(list);
    }

    private class LazyList<T> : IList<T>, IDisposable
    {
      public LazyList(IEnumerable<T> list)
      {
        _enumerator = list.GetEnumerator();
        _isFinished = false;
        _cached = new List<T>();
      } 
      public T this[int index]
      {
        get
        {
          if (index < 0)
            throw new ArgumentOutOfRangeException("index"); 
          while (_cached.Count <= index && !_isFinished)
            GetNext();
          return _cached[index];
        }
        set
        {
          throw new NotSupportedException();
        }
      } 
      public int Count
      {
        get
        {
          Finish();
          return _cached.Count;
        }
      } 
      public IEnumerator<T> GetEnumerator()
      {
        int current = 0;
        while (current < _cached.Count || !_isFinished)
        {
          if (current == _cached.Count)
            GetNext();
          if (current != _cached.Count)
            yield return _cached[current];
          current++;
        }
      } 
      public void Dispose()
      {
        _enumerator.Dispose();
        _isFinished = true;
      } 
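      public int IndexOf(T item)
      {
        int result = _cached.IndexOf(item);
        while (result == -1 && !_isFinished)
        {
          int countBefore = _cached.Count;
          GetNext();
          // Compare only when GetNext() actually cached a new value; this avoids
          // calling Last() on an empty cache and handles null items safely.
          if (_cached.Count > countBefore && EqualityComparer<T>.Default.Equals(_cached[_cached.Count - 1], item))
            result = _cached.Count - 1;
        }
        return result;
      }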
      public void Insert(int index, T item)
      {
        throw new NotSupportedException();
      } 
      public void RemoveAt(int index)
      {
        throw new NotSupportedException();
      } 
      public void Add(T item)
      {
        throw new NotSupportedException();
      } 
      public void Clear()
      {
        throw new NotSupportedException();
      } 
      public bool Contains(T item)
      {
        return IndexOf(item) != -1;
      } 
      public void CopyTo(T[] array, int arrayIndex)
      {
        foreach (var item in this)
          array[arrayIndex++] = item;
      } 
      public bool IsReadOnly
      {
        get { return true; }
      } 
      public bool Remove(T item)
      {
        throw new NotSupportedException();
      } 
      System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
      {
        return GetEnumerator();
      } 
      private void GetNext()
      {
        if (!_isFinished)
        {
          if (_enumerator.MoveNext())
          {
            _cached.Add(_enumerator.Current);
          }
          else
          {
            _isFinished = true;
            _enumerator.Dispose();
          }
        }
      } 
      private void Finish()
      {
        while (!_isFinished)
          GetNext();
      } 
      readonly List<T> _cached;
      readonly IEnumerator<T> _enumerator;
      bool _isFinished;
    }
  }
}

Essentially, what the above code does is to wrap an IEnumerable<T> in a layer that disguises it as an IList<T>. Any value that we evaluate is automatically cached for easy lookup later, but we also don't evaluate values until they are specifically demanded.

The implication of this is that you no longer have to choose between caching all of your values up front and evaluating them lazily; you can have both with relatively little overhead.
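
To see the effect in isolation, here is a small sketch that counts how often the underlying sequence is actually evaluated. CachedSequence<T> is a hypothetical, stripped-down stand-in for LazyList<T> (enumeration only, no indexer, no disposal), written here just so the example stands alone.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for LazyList<T>: enumeration only, no indexer, no disposal.
public class CachedSequence<T> : IEnumerable<T>
{
    private readonly IEnumerator<T> _source;
    private readonly List<T> _cached = new List<T>();
    private bool _isFinished;

    public CachedSequence(IEnumerable<T> source)
    {
        _source = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; ; i++)
        {
            // Pull from the source only when the cache has run out.
            if (i == _cached.Count)
            {
                if (_isFinished || !_source.MoveNext())
                {
                    _isFinished = true;
                    yield break;
                }
                _cached.Add(_source.Current);
            }
            yield return _cached[i];
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class CachingDemo
{
    // Returns how many times the selector ran across two passes over the data.
    public static int CountEvaluations()
    {
        int evaluations = 0;
        var squares = Enumerable.Range(1, 5).Select(n => { evaluations++; return n * n; });
        var cached = new CachedSequence<int>(squares);

        cached.Take(3).ToList(); // evaluates only the first three values
        cached.ToList();         // reuses those three, evaluates the remaining two
        return evaluations;
    }
}
```

CountEvaluations() returns 5: each of the five values is computed exactly once, even though the sequence is walked twice. Running the same two passes directly against squares would evaluate the selector eight times.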

Returning to our example of computing primes, if we simply replace this line:

primes = knownPrimes.Concat(computedPrimes);

With this one:

primes = knownPrimes.Concat(computedPrimes).ToLazyList();

Then everything is suddenly fine, and we can compute primes at a very rapid rate.

NSDuctTape on Google Code

This is by no means a huge update, but NSDuctTape is now hosted on Google Code. You can download the source (or binaries) from there, and the code is also hosted on their Subversion servers.

Announcing NSDuctTape

About a year ago, I started looking for a way to create applications for OS X using the .NET Framework. I found a couple of alternatives, such as Cocoa# and Dumbarton, but neither of them seemed to be quite what I was looking for.

Cocoa# is a project that aims, first and foremost, to create CLR wrappers for most of the classes in the Cocoa library and secondarily, to allow developers to craft classes that are capable of being accessed by the Objective-C runtime. At first, this sounded like exactly what I wanted--until I realized that Cocoa strongly encourages the use of the Model-View-Controller pattern in its applications, meaning that most of the code written for such an application would be classes that Objective-C would be accessing. While the wrappers are cool, the overhead required to create a class that is consumable by the Objective-C runtime makes it less than appealing.

Next, I turned to Dumbarton. However, I quickly turned away when I realized that applications using Dumbarton are required to host Mono in such a way as to require the application to be released under the GPL [Update: Paolo Molaro, a significant contributor to the Mono project, informs me that this statement is inaccurate. In any case, I would still argue that the issue is confusing enough that many would shy away from using Dumbarton, even if there really is no licensing requirement]. I have nothing against using (or contributing to) GPLed software, but I don't really want to be tied down to it as soon as I begin developing an application.

What was I to do? I looked into contributing to Cocoa#, but its design philosophy is completely different from my own, and I quickly realized that if I wanted to be happy with the result, I would have to write my own library from scratch.

So I did.

I'm posting a very early release of my new library, NSDuctTape, on the website, today. My goal in designing the application was to remove as much friction as possible from the process of designing Model and Controller classes, with the understanding that most views will be defined using Apple's Interface Builder. Today's release supports Models pretty well, and it also supports bindings from Cocoa objects to CLR object properties, so Controllers aren't even always necessary.

I'll post a tutorial soon, but for the moment, you can download it or read more about it.

Stay tuned for more!

Applications of Iterate()

Since the code snippets in my previous post consisted mostly of vague hand-waving, I thought it would be good to spend some time today showing how Iterate() can be useful in common algorithms. Consider a fairly standard prime number generator:

public static class Primes 
{
  public static IEnumerable<ulong> GetPrimes(ulong max)
  {
    var list = new List<ulong> { 2 };
    for (ulong candidate = 3; candidate < max; candidate += 2)
    {
      var sqrt = (ulong)Math.Sqrt(candidate);
      bool isPrime = true;
      foreach (var prime in list)
      {
        if (prime > sqrt)
          break;

        if (candidate % prime == 0)
        {
          isPrime = false;
          break;
        }
      }

      if (isPrime)
        list.Add(candidate);
    }

    return list;
  }
}

Surely, one could provide additional optimizations to this routine, but it will serve our purposes as it is. So where do we start? The easiest simplification is to get rid of the inner loop because it is already characterized as an operation on a collection. Let's begin by moving the first breaking condition out of the loop:

var smallPrimes = list.TakeWhile(n => n <= sqrt);
foreach (var prime in smallPrimes)
{
  if (candidate % prime == 0)
  {
    isPrime = false;
    break;
  }
}

Having done this, it now becomes clear that we are verifying a condition against all elements of a collection. We can rewrite this as:

bool isPrime = list.TakeWhile(n => n <= sqrt).All(n => candidate % n != 0);

Of course, now that we're only using isPrime in one place after its definition, we may as well not have it as a temporary variable:

for (ulong candidate = 3; candidate < max; candidate += 2)
{
  var sqrt = (ulong)Math.Sqrt(candidate);

  if (list.TakeWhile(n => n <= sqrt).All(n => candidate % n != 0))
    list.Add(candidate);
}

These are fine modifications, but they may have left you wondering when Iterate() becomes useful. To demonstrate this, let's begin by applying it in the most obvious place:

public static IEnumerable<ulong> GetPrimes(ulong max)
{
  var list = new List<ulong> { 2 };

  var candidates = EnumerableUtility.Iterate<ulong>(3, n => n < max, n => n + 2);
  foreach (var candidate in candidates)
  {
    var sqrt = (ulong)Math.Sqrt(candidate);

    if (list.TakeWhile(n => n <= sqrt).All(n => candidate % n != 0))
      list.Add(candidate);
  }

  return list;
}

Well, that doesn't make our code clearer, shorter, or more efficient! However, having a foreach loop does make it clearer that we are performing a filter operation on the collection before actually doing anything with the values:

public static IEnumerable<ulong> GetPrimes(ulong max)
{
  var list = new List<ulong> { 2 };

  var candidates = EnumerableUtility.Iterate<ulong>(3, n => n < max, n => n + 2);

  var primes = candidates.Where(
    candidate =>
    {
      var sqrt = (ulong)Math.Sqrt(candidate);
      return list.TakeWhile(n => n <= sqrt).All(n => candidate % n != 0);
    });

  foreach (var prime in primes)
    list.Add(prime);

  return list;
}

As an aside, if you're not yet comfortable with closures, you may be a bit unsettled by the ordering of statements in this snippet: it appears as if we are checking the list of primes before we have actually populated it! Rest assured that this is not the case: the Where() function, as it is implemented in System.Linq.Enumerable (in the System.Core assembly), does not calculate the next value in its result until that value is actually requested. Because we never request a new value until we've added all previous values to the list, we have nothing to fear (note, on the other hand, that other implementations of Where(), such as the one included in the upcoming ParallelFX library, may not share this implementation detail, so be sure you're using the right one!).
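
If you'd like to convince yourself of this ordering, here is a small sketch that uses the same trick on a simpler problem (removing duplicates). DeferredDemo and its variable names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static List<int> Run()
    {
        var seen = new List<int>();

        // Nothing is evaluated here: Where() only records the predicate.
        var fresh = new[] { 1, 2, 2, 3, 1, 4 }.Where(n => !seen.Contains(n));

        // Each value is tested just before it is handed to the loop body,
        // so the predicate always sees the values added on earlier iterations.
        foreach (var n in fresh)
            seen.Add(n);

        return seen;
    }
}
```

Run() returns 1, 2, 3, 4. If Where() evaluated its predicate eagerly, seen would still be empty when every value was tested, and the duplicates would slip through.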

It would seem that we are near the end of possible refactorings for this routine. However, to believe this is to forget that we are not required to calculate all the values in the list ourselves! Getting rid of list, we finally arrive at this:

public static IEnumerable<ulong> GetPrimes(ulong max)
{
  var knownPrimes = new ulong[] { 2, 3 };

  var candidates = EnumerableUtility.Iterate<ulong>(5, n => n < max, n => n + 2);

  IEnumerable<ulong> primes = null;
  var computedPrimes = candidates.Where(
    candidate =>
    {
      var sqrt = (ulong)Math.Sqrt(candidate);
      return primes.TakeWhile(n => n <= sqrt).All(n => candidate % n != 0);
    });

  primes = knownPrimes.Concat(computedPrimes);

  return primes;
}

Once again, we have a rather clever use of closures, but I hope this one isn't quite as jarring as the last one might have been. While it may at first appear that we are using primes before it has been assigned a real value, recall that the evaluation of Where() does not occur until values are requested from it, which no longer even happens in this function.

It is also perhaps worth noting that our list of known primes has expanded to include 3. The reason for this is that failure to do so would cause infinite recursion when attempting to access the first value in computedPrimes. If it doesn't seem obvious to you, try it and see for yourself :).

One significant benefit to this latest refactoring is that we now only calculate each prime upon demand. Because of this, we may as well compute all primes from 1 to ulong.MaxValue, and forget about the max parameter:

public static IEnumerable<ulong> GetPrimes()
{
  var knownPrimes = new ulong[] { 2, 3 };

  var candidates = EnumerableUtility.Iterate<ulong>(3, n => n != ulong.MaxValue, n => n + 2).Select(n => n + 2);

  IEnumerable<ulong> primes = null;
  var computedPrimes = candidates.Where(
    candidate =>
    {
      var sqrt = (ulong)Math.Sqrt(candidate);
      return primes.TakeWhile(n => n <= sqrt).All(n => candidate % n != 0);
    });

  primes = knownPrimes.Concat(computedPrimes);

  return primes;
}

Aside from removing max, our only significant change has been to candidates, which is now defined as all odd numbers from 5 to ulong.MaxValue (the trickiness with the call to Select() is to keep ourselves from trying to go past MaxValue).
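
The Select() trick is easier to check at a smaller scale. The sketch below builds the same kind of sequence over byte instead of ulong; OverflowDemo is a hypothetical name, and Iterate() is repeated only so that the example stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverflowDemo
{
    // Same shape as Iterate() from the previous post, repeated so the sketch stands alone.
    public static IEnumerable<T> Iterate<T>(T initial, Func<T, bool> fnContinue, Func<T, T> fnNext)
    {
        for (T current = initial; fnContinue(current); current = fnNext(current))
            yield return current;
    }

    // Odd numbers from 5 to byte.MaxValue, built the same way as candidates.
    public static List<byte> OddCandidates()
    {
        return Iterate<byte>(3, n => n != byte.MaxValue, n => (byte)(n + 2))
            .Select(n => (byte)(n + 2))
            .ToList();
    }
}
```

Iterate() stops as soon as it reaches 255 without yielding it, so the largest value it hands to Select() is 253, and adding 2 lands exactly on byte.MaxValue. Had we instead iterated from 5 with a condition of n <= byte.MaxValue, the step after 255 would wrap around (in an unchecked context) and the sequence would never end.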

While we now have some beautiful code, you may have noticed that the last two snippets run significantly more slowly than their fully imperative counterparts. The reason for this is that we are no longer caching our prime numbers, so whenever we have to iterate through the list, all primes must be recalculated. Does this mean that we are doomed? Certainly not! In fact, a very minor change to this code is all that is necessary, but the details will have to wait until the next post.

Beyond Loops

Lately, I've been reading quite a bit about the functional programming concepts that are finally coming to C#. The benefits of this new hybrid (imperative/functional) style are numerous, from greater readability (for those who grasp the new idioms, anyway) to more convenient parallelization. However, I've seen very little about the advantage that excites me most: eliminating loops.

What?!

Okay, I'm not really advocating that we actually stop writing code that contains loops—after all, the most reasonable alternative to loops is recursion, and recursion hurts my brain. I am instead suggesting that the time has come for most of our loops to be abstracted away, and here is my first attempt at hiding them:

public static class EnumerableUtility
{
  public static IEnumerable<T> Iterate<T>(T initial, Func<T, bool> fnContinue, Func<T, T> fnNext)
  {
    for (T current = initial; fnContinue(current); current = fnNext(current))
      yield return current;
  }

  public static void ApplyToAll<T>(this IEnumerable<T> list, Action<T> fnAction)
  {
    foreach (T item in list)
      fnAction(item);
  }
}

(Note: Did It With .NET defines a similar function, Sequence(), that operates similarly but specifies the arguments in a different order. I prefer my function, but mine could be replaced by his by calling Sequence(initial, fnContinue, fnNext). Either way, the effect is the same.)

Why is this so great? All I really did was put a couple loops into a function, right? Well…sort of, but if you'll stick with me, I think that you'll see that these functions have some significant advantages.

The Problem

In order to compute very much that is of interest to anyone, a language must support sequence, choice, and repetition. When taken in isolation, each of these concepts is easy to grasp, but when combined, they can very quickly exceed the limits of human comprehension. Personally, I find that a few big loops will tax my abilities faster than either of the other two structures, but I also find that repetition tends to be the least frequently abstracted of the three basic constructs.

New Paradigm

Let's look at some code:

var somethingOrOther = initialValue;
while (SomeCondition(somethingOrOther))
{
  if (SomeOtherCondition(somethingOrOther))
    DoStuff(somethingOrOther);
  somethingOrOther = GetNext(somethingOrOther);
}

Or its common counterpart:

for (var somethingOrOther = initialValue; SomeCondition(somethingOrOther); somethingOrOther = GetNext(somethingOrOther))
{
  if (SomeOtherCondition(somethingOrOther))
    DoStuff(somethingOrOther);
}

The details and complexity vary, but I've seen code like this many times in my (admittedly short) career. We've been trained to accept such things as normal, but is this really as good as it gets? Here are some of the issues I see:

  • If the code is sufficiently complex, it becomes difficult to determine with certainty where somethingOrOther is assigned—especially if the original developer was undisciplined.
  • In the case of the first example, we have an iterator left outside the scope of the loop, which is just asking for trouble.
  • We're mixing repetition and choice, which will lead to confusion as the complexity of the loop grows.

However, the operations defined in EnumerableUtility allow us to think of this as a series of operations on a list, rather than as a loop:

var items = EnumerableUtility.Iterate(initialValue, SomeCondition, GetNext);
items = items.Where(SomeOtherCondition);
items.ApplyToAll(DoStuff);

How is this better? First of all, the second and third operations now appear as a single operation on a list, rather than as a series of operations on individual items, and I find this to be easier on my mental model. Second, we no longer have to concern ourselves with strange assignments to our iterator because the iterator itself has been abstracted away! Finally, having all of the primary operations (enumerating the values, filtering them, and performing an operation on the remaining values) separated from each other allows us to consider each in isolation, which is significantly easier on the brain.
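
For something a little less abstract, here is the same three-step shape with concrete stand-ins: start at 1, keep doubling while the value is below 100, keep only the values above 10, and collect what remains. LoopFreeDemo and the specific conditions are invented for the illustration; EnumerableUtility is repeated so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableUtility
{
    public static IEnumerable<T> Iterate<T>(T initial, Func<T, bool> fnContinue, Func<T, T> fnNext)
    {
        for (T current = initial; fnContinue(current); current = fnNext(current))
            yield return current;
    }

    public static void ApplyToAll<T>(this IEnumerable<T> list, Action<T> fnAction)
    {
        foreach (T item in list)
            fnAction(item);
    }
}

public static class LoopFreeDemo
{
    public static List<int> LargePowersOfTwo()
    {
        var results = new List<int>();

        // Enumerate: 1, 2, 4, 8, 16, 32, 64
        var items = EnumerableUtility.Iterate(1, n => n < 100, n => n * 2);

        // Filter: keep only the values above 10
        items = items.Where(n => n > 10);

        // Act: collect each remaining value
        items.ApplyToAll(results.Add);

        return results;
    }
}
```

LargePowersOfTwo() returns 16, 32, 64, and each of the three steps can be read (and changed) without looking at the other two.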

Next time, I intend to introduce some common algorithms and show how using lists instead of loops can simplify their implementation.

And now for something completely different!

I just had to point out this bit of feedback for Visual Studio 2008 Beta 2. Did you notice the issue when you clicked through the license?

Subversion on the Macintosh

Prerequisites
  1. Move Apache 1.3 to Apache 2. This is actually much less painful than it sounds, but it's quite necessary because Subversion doesn't support anything less than Apache 2. You can try running multiple versions of Apache from one box if you like, but in my case, I found that it was easier to just make the switch. If we're lucky, Apple will put Apache 2 on Leopard, and we won't ever have to worry about this part again. Here's how you get Apache 2 on your machine:
    1. Run FinkCommander (you got that back when you were reading my first article on .NET development, right?)
    2. Install the apache2 package. That wasn't so hard, was it?
  2. If you're also serving up ASP.NET on this machine, rebuild mod_mono (discussed in part 2 of my .NET development series). When you run the configuration script, add --with-apxs=/sw/bin/apxs2 to the command line arguments.
    1. If you've installed a CruiseControl.NET dashboard (as discussed in part 3 of the .NET development series), then at the end of /sw/etc/apache2/httpd.conf, add:
Alias /ccnet "/web/ccnet"
AddMonoApplications default "/ccnet:/web/ccnet"
<Location /ccnet>
SetHandler mono
</Location>

Installing Subversion
  1. Run FinkCommander and install the libapache2-mod-svn package.
  2. Run sudo mkdir /svn, where /svn is the path where you want to store your repository.
  3. Run sudo svnadmin create /svn.
  4. Run sudo chown -R www /svn, where www is the name of the user that Apache runs as.
  5. Edit /sw/etc/apache2/mods-enabled/dav_svn.conf; the comments should explain what you need to do here.
  6. Restart Apache (/sw/etc/sbin/apache2ctl -k restart).
At this point, you should have a pretty basic Subversion server set up at http://your-machine/svn. Of course, there are all sorts of other things that you can do to your repository, like set it up for secure access, put it on a different port, and what-have-you, but I haven't had need for any of these features (I'm running on a small, personal network). However, I would suspect that setting up these features would not vary as much from platform to platform as the initial setup procedure does, so try Google if you need that stuff.
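
For reference, a minimal dav_svn.conf for the setup above might look something like the fragment below. This is a sketch rather than a copy of my own file: it assumes the repository lives at /svn and should be served at the /svn URL, and it uses only the standard mod_dav_svn directives (DAV and SVNPath). The file that Fink installs may already contain a commented-out version of the same block.

```apache
# Hypothetical contents for dav_svn.conf; adjust the paths to your setup.
<Location /svn>
  # Hand requests for this URL to mod_dav_svn
  DAV svn
  # Filesystem path of the repository created with svnadmin create
  SVNPath /svn
</Location>
```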

Caveats

Sadly, I lost my notes on how to get Apache 2 to start up at boot. However, I did find this forum posting that suggests that it can be accomplished by replacing the Apache 1.3 startup scripts with links to the Apache 2 scripts.

I've also found that sometimes, even though I have Apache 2 set up to start up at boot, it fails to do so, or CruiseControl.NET fails to start (or eventually locks up). However, this happens infrequently enough that I haven't bothered to track down the root of the problem, and a reboot (or two) generally fixes it. If anyone reading this happens to stumble upon a solution to my problem, I'd love to hear about it.