Sound Code: May 2008

Friday 30 May 2008

Live Mesh First Impressions

I was very interested when I first heard of the Live Mesh product that Microsoft are building, as it looks like it might solve some of the problems I have working on three separate computers (see my previous post on Live Mesh here). It's very rare indeed that I would say this, but on this occasion, Joel Spolsky just doesn't get it.

I signed up at the Live Mesh site to be part of the beta test, and my invitation came through last week, so I installed Live Mesh on each of the three computers, which went smoothly.

Remote Desktop

The first thing I tried was to connect using the remote desktop feature, which worked well although it got the aspect ratio of my laptop screen wrong, and the speed wasn't that great. I eventually crashed FireFox though.

I tried again while I was at work to connect back to one of my home PCs, but it told me that the PC was in use (presumably my wife or children), so I am not sure what the rules for it letting you access are.

One slight downside to the remote desktop is that it does require you to install an ActiveX control, meaning that if you were to be on someone else's PC and wanted to connect back to your home desktop, you would need to install the ActiveX control on their PC.

Folder Sharing

The other key feature I wanted to try out was folder sharing. It is trivially easy to set a shared folder up - simply right-click it and choose "Add folder to your Live Mesh" from the context menu. However, I immediately ran into some limitations. I couldn't share my pictures or my music because currently you must share all folders into your space in the cloud, of which you are limited 5GB. Microsoft have hinted that future versions will allow you to share just between computers without requiring online storage too.

So I decided my test would be to share a folder of source code on a project I am working on. This would allow me to browse and potentially work on the code from other PCs. The project I chose had quite a large source code folder, which ran to just over 1GB. It uploaded to the cloud remarkably quickly, but very soon I was thinking of additional features Live Mesh needs if it is to be really useful.

I want to exclude certain file types and sub-folders within a synchronized folder. For example, the contents of bin and obj folders are not needed, and one of my subfolders contained a lot of unnecessary data files that I didn't want uploaded.
When I did a build, so many files in the synchronized folder were changing that the Live Mesh application was using a lot of CPU (my work PC is very underpowered). I could do with a simple way of temporarily turning off sync and turning it back on when I am finished making major changes. This would be a must if I used Live Mesh to share folders for Digital Audio Workstation projects. I would not want it trying to synchronize while I was recording. Update: I have just noticed that the Live Mesh client has a "work offline" option which would probably suffice for most scenarios.
On my home PCs, I would like to be able to browse the code in this project, but I don't really want to be modifying it (for one thing it is a VS2005 project and my home PC only has VS2008 installed - it would be a disaster if I inadvertently upgraded it!). It would be great if I could set up a folder as a read-only synchronized copy. This might also be useful for music or photo sharing in some instances.

Concerns

As I mentioned in my earlier post, one of my concerns is how much of my monthly broadband allowance Live Mesh will eat up. The ability to synchronize between locally networked devices without going via the Internet would help a lot.

I am also beginning to wonder whether Windows Live OneCare's backup has the intelligence not to back up the same files twice from two PCs in my OneCare circle. Because if I started sharing photos and music, I wouldn't want to start doubling my backup disk space requirements (my USB backup hard disk is getting quite full).

Conclusion

Live Mesh has a lot of promise. If MS can address some of these limitations I have mentioned then I can see the Live Mesh client being a must install on all the PCs I work use.

The fact that it is all or nothing when synchronizing a folder is a little awkward. It means that I would have to either share loads of smaller folders, or radically reorganize my folder structures on my PC so that all things I want to share (e.g. source code projects, documents etc) go in one folder, and the things I don't want to share go in another.

I'm really looking forward though to seeing what features Microsoft will add to Live Mesh, and I've signed up already to try out the SDK when it becomes available.

Thursday 29 May 2008

Wanted Language Feature: Reinterpret Cast of Byte Arrays

I am a huge fan of C#, but one of the most frustrating things about it is dealing with byte arrays which actually represent some other type of data. For example, suppose I have an array of bytes that I know actually contains some floating point numbers. What I would like to be able to do is:

byte[] blah = new byte[1024];
float[] flah = (float[])blah;

But of course, this won't compile. There are two options:

1. Create a new array of floats and copy the contents of the byte array into it, using the BitConverter.ToSingle method. I could then access the contents as floats. The disadvantages are obvious. It requires twice the memory, and copying it across is not free. Also if I modify any values, they may need to be copied back into the original byte array.

2. Using the unsafe and fixed keywords, pin the byte array where it is and obtain a float pointer. The disadvantages are obvious. First, pinning objects interferes with the garbage collector, reducing performance (and performance is often exactly what you want when you are dealing with arrays of numbers), and second, as the keyword suggests, pointers are unsafe. Here's some example code from my open source audio library NAudio that shows me using this method to mix some audio:

unsafe void Sum32BitAudio(byte[] destBuffer, int offset, byte[] sourceBuffer, int bytesRead)
{
    fixed (byte* pDestBuffer = &destBuffer[offset],
              pSourceBuffer = &sourceBuffer[0])
    {
        float* pfDestBuffer = (float*)pDestBuffer;
        float* pfReadBuffer = (float*)pSourceBuffer;
        int samplesRead = bytesRead / 4;
        for (int n = 0; n < samplesRead; n++)
        {
            pfDestBuffer[n] += (pfReadBuffer[n] * volume);
        }
    }
}

But does it really need to be this way? Why can't the .NET framework let me consider a byte array to be a float array, without the need for copying, pinning or unsafe code? I've tried to think through whether there would be any showstoppers for a feature like this being added...

1. The garbage collector shouldn't need any extra knowledge. The float array reference would be just like having another byte array reference, and the garbage collector would know not to delete it until all references were gone. It could be moved around in memory if necessary without causing problems.

2. Sizing need not be an issue. If my byte array is not an exact multiple of four bytes in length, then the corresponding float array would simply have a length as large as possible.

3. This would only work for value types which themselves only contained value types. Casting an array of bytes to any type that contained a reference type would of course be unsafe and allow you to corrupt pointers. But there is nothing unsafe about casting say an array of bytes into an array of DateTimes. The worst that could happen would be to create invalid DateTime objects.

The benefits of adding this as a language feature would go beyond simply playing with numbers. It would be ideal for interop scenarios, removing the need for Marshal.PtrToStructure in many cases. Imagine being able to write code like the following:

byte[] blah = new byte[1024];
int x = MyExternalDllFunction(blah);
if (x == 0)
{
    MyStructType myStruct = (MyStructType)blah;
}
else
{
    MyOtherStructType myOtherStruct = (MyOtherStructType)blah;
}

What do you think? Would you use this feature if it was in C#? It needn't be implemented as a cast. It could be a library function. But the key thing would be to create two different struct or array of struct types that provided views onto the same block of managed memory.

Wednesday 28 May 2008

Combining Paths in XAML for Silverlight

I have been attempting to make some nice looking buttons to use with my Silverlight Audio Player, and came up with a basic design in XAML, borrowing some colour ideas from iTweek's icons on deviantart. The XAML for the basic play button with drop shadow is:

<Ellipse Canvas.Left="53" Canvas.Top="76" Width="26" Height="8" Fill="#40000000" />
<Ellipse Canvas.Left="50" Canvas.Top="50" Stroke="#396C15" Width="32" Height="32">  
  <Ellipse.Fill>
    <LinearGradientBrush EndPoint="0,1">
      <GradientStop Color="#1070B434" Offset="0.0" />
      <GradientStop Color="#70B434" Offset="0.5" />
      <GradientStop Color="#FFAFD855" Offset="1.0" />
    </LinearGradientBrush>      
  </Ellipse.Fill>
</Ellipse>
<Path Canvas.Left="50" Fill="#FFFFFF" Stroke="#396C15" Data="M 13,58 l 10,8 l -10,8 Z" StrokeLineJoin="Round" />

Things got a bit more interesting when I tried to make a fast-forward icon by overlapping two triangles. Simply drawing two triangles over the top of each other does not give the desired result:

<Path Canvas.Left="80" Fill="#FFC0C0" Stroke="#396C15" Data="M 13,58 l 10,8 l -10,8 Z "
  StrokeLineJoin="Round"  />
<Path Canvas.Left="80" Fill="#FFC0C0" Stroke="#396C15" Data="M 18,58 l 10,8 l -10,8 Z " 
  StrokeLineJoin="Round"  />

The Path Data property allows you to specify more than one closed shape, but this results in the strokes of both triangles being visible rather than combining to form one outline. This happens whether you use the Path syntax or create a GeometryGroup and add PathGeometrys. Rather annoyingly PathGeometry does not have a Data Property, which makes for some cumbersome XAML:

<Path Canvas.Left="105" Fill="#FFC0FF" Stroke="#396C15" Data="F 1 M 13,58 l 10,8 l -10,8 Z M 18,58 l 10,8 l -10,8 Z"
  StrokeLineJoin="Round"  />

<Path Canvas.Left="130" Fill="#FFFFC0" Stroke="#396C15" Data="M 13,58 l 10,8 l -10,8 Z M 18,58 l 10,8 l -10,8 Z" 
  StrokeLineJoin="Round"  />

<Path Canvas.Left="155" Fill="#C0FFC0" Stroke="#396C15" StrokeLineJoin="Round">
  <Path.Data>
    <GeometryGroup FillRule="NonZero">
        <PathGeometry>
          <PathFigure StartPoint="13,58" IsClosed="True">
            <LineSegment Point="23,66"  />
            <LineSegment Point="13,74"  />
          </PathFigure>
        </PathGeometry>
        <PathGeometry>
        <PathFigure StartPoint="18,58" IsClosed="True" >
            <LineSegment Point="28,66"  />
            <LineSegment Point="18,74"  />
          </PathFigure>
        </PathGeometry>
    </GeometryGroup>                  
  </Path.Data>
</Path>

This is still not the effect I want, irrespective of the FillRule used. The solution in WPF is to use a CombinedGeometry. This allows two PathGeometrys to be specified that can be combined as a union to create one shape. Again we have very verbose XAML because we can't use the Path mini language:

<Path Canvas.Left="180" Fill="#C0C0FF" Stroke="#396C15" StrokeLineJoin="Round">
  <Path.Data>
    <CombinedGeometry GeometryCombineMode="Union">
      <CombinedGeometry.Geometry1>
        <PathGeometry>
          <PathFigure StartPoint="13,58" IsClosed="True">
            <LineSegment Point="23,66"  />
            <LineSegment Point="13,74"  />
          </PathFigure>
        </PathGeometry>
      </CombinedGeometry.Geometry1>
      <CombinedGeometry.Geometry2>
        <PathGeometry>
          <PathFigure StartPoint="18,58" IsClosed="True" >
            <LineSegment Point="28,66"  />
            <LineSegment Point="18,74"  />
          </PathFigure>
        </PathGeometry>
      </CombinedGeometry.Geometry2>
    </CombinedGeometry>
  </Path.Data>
</Path>

This produces the shape I wanted:

But now we run into another problem. The Silverlight 2 beta does not support CombinedGeometry, and I have no idea if this is going to be supported as part of the full release of Silverlight 2.

So how can we get this in Silverlight? At the moment I know of only two solutions:

1. Do the maths yourself. For two triangles as in my example, this wouldn't be too hard, but it would be a real pain if your combined shape involved any curves or ellipses.

2. Get Expression Blend to do it for you. Draw the two shapes, select them, then select Object | Combine | Unite. This will create a single path you can use in a Silverlight application. It doesn't leave the XAML quite how I would like it (adding margins and using absolute coordinates in the path, rather than relative).

<Path Height="17" 
  HorizontalAlignment="Left" 
  Margin="138.5,91.5,0,0" 
  VerticalAlignment="Top" 
  Width="16" Fill="#C0C0FF" Stretch="Fill" 
  Stroke="DarkGreen" 
  StrokeLineJoin="Round"
  Data="M0.5,0.5 L5.5,4.5 L5.5,0.5 L15.5,8.5 L5.5,16.5 L5.5,12.5 L0.5,16.5 L0.5,0.5 z"/>

Friday 23 May 2008

Vista's Most Unhelpful Dialog

Anyone have any idea how I even begin to track down what the phantom program that is 'using' my USB stick is? How should I know which of the 76 generically named processes on my PC "might" be using the device?

Friday 16 May 2008

Primary Keys and URLs

About a year ago, I blogged about the relative merits of using integers or GUIDs as primary keys in databases. The particular cause for concern was what the URLs would look like:

www.mysite.com/View.aspx?id=104

which looks much nicer than

www.mywebsite.com/View.aspx?id=015B34D5-A301-4543-AE1A-16708B19F602

With the advent of ASP.NET MVC, the choice becomes something like:

www.mysite.com/product/104

www.mywebsite.com/product/015B34D5-A301-4543-AE1A-16708B19F602

It was because of the consideration of what the URL would look like that I chose to stick with integers for the database I was using at the time.

But the more I have thought about this problem, the more I have come to the conclusion that the database row identifier should never appear in the URL anyway. After all, if you are displaying products, they will also have a unique product code. If you are displaying user details, they will have a user name. If you are displaying categories, they will have a unique name. Even with a blog or CMS system, the trend now is to have a "slug" to give a unique string for each post to use in the "permalink", often combined with the year and month to help avoid naming conflicts. So for example:

www.mysite.com/product/X1494-M

www.mysite.com/category/programming

www.mysite.com/user/mheath

www.mysite.com/blog/2008/5/welcome-to-my-new-site

If you allow these "slug" fields to be editable by the user, you can use genuinely meaningful URLs, without ever needing to reveal the database key to the user.

So I'm leaning towards using GUIDs again. The only case where I think I would need them to appear in a URL is in admin specific links such as editing or deleting a blog entry, where it would be safer to use them just in case the slug was not unique. They may also be useful to form really permanent permalinks for the cases where the slug itself may be changed in the future.

Thursday 8 May 2008

Google versus Microsoft Documents Offline Access

I started using Google Docs a couple of years ago, and quickly it became my number one place to keep the documents that previously came everywhere with me in my pocket on a USB drive. However, after a few experiences where I urgently needed to access a document while my internet connection was down, I revised my approach and reverted to the USB stick.

However, this week I noticed that Google have announced offline access for Google Docs, provided you install Google Gears. And while the user interface is still nowhere near the standard of Microsoft Word, it is steadily improving, and for most documents I write is more than sufficient. If they were to create a Flash or Silverlight interface, they could potentially come up with a serious rival to Word.

And what have Microsoft been doing all this time? Well sadly, they are lagging behind. When Live Mesh eventually shows up, we will be able to access our Word documents from all the computers we use, which is great. But you will still need Office installed on those PCs.

And then there is Office Live Workspace which already offers a decent amount of online storage for Office documents, but as far as I can tell, has no offline support, and again you absolutely must have Office installed to edit the documents. There doesn't even seem to be an easy way of printing either. So for now Google Docs is definitely in the lead for those documents that don't need advanced formatting and layout features (which is most of my personal documents).

Now understandably, Microsoft may be reluctant to give a free version of Word away in the browser. But what if Microsoft allowed us to associate a licensed copy of Office with a Windows Live ID? That way they could give access to a basic set of in-browser editor features (say using Silverlight) to allow editing and printing of documents from anywhere on the web. They could even let us use up one of the three licenses you get with Office Home and Student edition to ensure they get a revenue stream from this. If they don't come up with a good way of working on documents in the browser before too long, they could end up losing a lot of ground to Google, just like IE has done to FireFox.

And one last thing ... why doesn't Office Live Workspace allow a OneNote notebook to be created in the cloud? That would be awesome.

Using ASP.NET MVC HtmlHelper.Form

Today, I tried making a HTML form with the ASP.NET MVC HtmlHelper, but it took me a while to get it working as documentation on the web is a little sparse at the moment. Here's some basic instructions.

View

In the view, you can use a using statement to ensure the Form tag is closed properly. I am using the generic version of the HtmlHelper.Form method, so that I can easily specify a method on my controller to be called when the form submit button is clicked. This method can take parameters if you want (for example the ID when creating an edit form), but it should not have parameters for the actual input fields on the form - they will be read out of the Request object by the controller.

<h2>New Post</h2>
<% using (Html.Form<BlogController>(c => c.CreatePost()))
  { %>
<label for="postTitle">Title</label>
<%= Html.TextBox("postTitle") %>
<br />
<%= Html.TextArea("postBody","", 10,80) %>
<%= Html.SubmitButton("submitButton","Save") %>
<% 
  } %>

Controller

The controller method, as I have already indicated does not need to take the form inputs as parameters. It reads these out of the Request.Form dictionary.

public ActionResult CreatePost()
{
   string postTitle = Request.Form["postTitle"];
   string postBody = Request.Form["postBody"];   
   Post post = new Post { Title = postTitle, Body = postBody };
   post.Status = PublishStatus.Published; 
   blogService.CreatePost(post);
   return RedirectToAction(new { action = "Index" });
}

There is a Binding.UpdateFrom helper method that you can use to speed up the code further if you name your fields correctly.

Testing

It's fairly easy to test your controller's new method. Simply add references to System.Web and System.Web.Abstractions and you can populate the Request.Form dictionary before calling your action method:

[Test]
public void BlogControllerCanCallCreatePost()
{
   BlogController blogController = new BlogController(new TestBlogRepository());
   blogController.Request.Form["postTitle"] = "Title";
   blogController.Request.Form["postBody"] = "Body";
   ActionResult result = blogController.CreatePost();
}

EDIT: Looks like I spoke too soon. The above test code will actually fail because blogController.Request is null. I'll update this post once I have worked out an easy way to populate the Request.Form dictionary. And in future I'll try to remember to actually run my unit tests before declaring them a success!

Thursday 1 May 2008

The Great Refactoring Debacle

The Vision

I have been giving some technical seminars to the rest of the development team at my workplace every few months, trying to keep them up to date with good coding practices and the latest technologies. While a few of them are already quite interested in these topics, on the whole most are simply content to plough on with the knowledge they have and only learn new stuff when an obstacle confronts them. But on the whole, my presentations have been well received.

It was a few months ago that I brought up the subject of refactoring. Twenty developers all coding away merrily on the same project for two years things had resulted in some less than ideal code. It was not as if the product was low quality. Far from it. Our bug count was low and the customers who were using the product hadn't escalated any major issues to the development team.

But there was a growing awareness that the codebase was getting a little out of hand, and management agreed. We were going to spend a few months simply working on the quality of the code, ready for a big "general release". No new features, just bug fixing and "refactoring".

The Strategy

This was a "once in a blue moon" opportunity to right some of the wrongs from earlier phases of the project. I presented my opinions on where I thought our efforts would be best spent during our two months of refactoring. We would focus on code smells, breaking huge classes and methods down into smaller pieces that did one thing and did it well.

At the same time, the more daring developers would try to eradicate some of the very tight coupling that had crept in, and to separate the GUI from business logic. Without these two advances, any hopes of running automated tests would be merely a pipe dream.

Finally, there were some classes that were being used in multiple places resulting in it being far too easy to inadvertently break someone else's code. I proposed to separate this code off to form some new extensible components that could be customised and used in different ways by different parts of the application.

The Refactoring

So we got to work. The build-breaking changes went first. Thousands of lines of code were changed. Developers were delighted as they successfully turned previously intimidating classes into groups of much more manageable code. Some of the worst-designed features of the application gradually morphed into a more streamlined and extensible framework.

There wasn't time for everything. In fact, some of my top recommendations had to be left to one side. And other ideas just proved too difficult. For example, I tried to extract some groups of Windows Forms controls from a Panel containing thousands of controls, but the process was just too fiddly (if only one of the refactoring tools could do this).

The bug count dropped dramatically. In fact, it dropped so fast that most of the development were seconded to test to help them catch up. All was looking good.

The Aftermath

Then came the system test. The first "full" system test in over a year. And there were lots of new bugs found. On the whole we managed to stay on top of them, but as the release date grew closer and closer, some "showstopper" bugs came out of the woodwork. The release date slipped three or four times and ended up coming a few weeks late. The overrun wasn't huge by comparison with many other projects, but an explanation was needed.

I wasn't in the meetings where blame was apportioned. In fact I was on holiday that week, but it was clear when I returned that a scapegoat had been identified. Refactoring had caused us grief, and perhaps we shouldn't have done it. On the surface of things, it was the refactoring that caused some of the very defects that delayed the release.

Now I could get on my high horse and demand a chance to stand up for "refactoring". I could point out that...

It's not really refactoring if you don't run unit tests (because you don't have any unit tests). It's simply re-architecting which is very risky.
Refactoring is always best done on code that you are actively working on. You understand what it is doing and why, and you have already allocated some time to testing it. Diving into a class and refactoring it simply because it is "big" is also risky (especially if you don't have any unit tests).
Refactoring is primarily about making small, incremental changes. Over a long period of time the structure and design of the code should improve. Trying to do it all in one hit is risky.
Refactoring does introduce bugs from time to time, because all modification to code risks introducing bugs. That is equally true whenever you add features or fix bugs. Indeed, it could be argued that the new features that "sneaked in" to the refactoring phase were the cause of a lot of our problems.
The main rewards of refactoring are felt in the future. For example, new features can be added much more quickly. Bugs can be found and fixed much more easily. In fact, my team is already reaping the fruits of the refactoring in the new feature we are adding, but this kind of benefit is not highly "visible" to management.

... well I could point these things out, but I suspect it would just come across as whining because my idea of "refactoring" apparently didn't "work". And if I say too much I risk someone deciding that refactoring should be "banned" because it is clearly too dangerous.

So I'll keep my mouth shut, and continue to improve the codebase bit by bit, sticking to the code I am already working on, and adding unit tests as I go. Management won't credit any resulting quality enhancements or development speed increases to the discredited "refactoring" idea, but at the end of the day, much of my job satisfaction comes from the knowledge that I have contributed to a quality project.

I suppose the bottom line is that the benefits of refactoring are not easily measurable. Bug counts are measurable. Completing features is measurable. But what is not measurable is how much refactoring has contributed to the speed at which bugs are fixed and new features are added. And as the saying goes, "if you can't measure it, it doesn't exist".

Technorati Tags: refactoring,unit testing