I was recently working on a Windows Metro app where we needed to capture a user’s scribble on an image background as a composite bitmap. The scenario was this:

  1. Display a bitmap on screen.
  2. Let the user scribble something on the image using a pen, mouse or touch.
  3. Display the scribble while it is being made.
  4. Let the user save a new bitmap with the scribble overlaid on top of the original.

Steps 1 through 3 were fairly easy to implement. In my case, a canvas with a bitmap background, handling pointer events, and drawing some lines based on the pointer positions reported through WinRT’s pointer APIs was all it took. It was #4 that required some work.

But before we get there, let’s take a look at the easy parts (#1, #2 and #3). The XAML is trivial.

<Grid>
  <Grid.RowDefinitions>
    <RowDefinition Height="*"/>
    <RowDefinition Height="Auto"/>
  </Grid.RowDefinitions>
  <Canvas Height="535" Width="800" x:Name="inkCanvas"
            PointerMoved="inkCanvas_PointerMoved_1" 
            PointerPressed="inkCanvas_PointerPressed_1" 
            PointerReleased="inkCanvas_PointerReleased_1">
    <Canvas.Background>
      <ImageBrush ImageSource="/Assets/ChichenItza.jpg"/>
    </Canvas.Background>
  </Canvas>
  <Button x:Name="btnSave" Content="Save" 
          Tapped="btnSave_Tapped_1" Grid.Row="1" 
          HorizontalAlignment="Center" Margin="0,0,0,75"/>
</Grid>
 

The Canvas with the ImageBrush as its background serves as the “scribble” area. To keep things simple for this post, I just included the image in the app as a resource. The Button, when tapped, initiates the code that saves the composite bitmap.

Point _previousPosition = default(Point);
SolidColorBrush _stroke = new SolidColorBrush(Colors.Red);
bool _pressed = false;

private void inkCanvas_PointerPressed_1(object sender,
  PointerRoutedEventArgs e)
{
  _previousPosition = e.GetCurrentPoint(inkCanvas).Position;  
  _pressed = true;
}

private void inkCanvas_PointerReleased_1(object sender,
  PointerRoutedEventArgs e)
{
  if (_pressed)
    _pressed = false;
}

private void inkCanvas_PointerMoved_1(object sender,
  PointerRoutedEventArgs e)
{
  if (!_pressed) return;
  var positions = e.GetIntermediatePoints(inkCanvas).
    Select(ppt => ppt.Position);

  foreach (Point pt in positions)
  {
        
    inkCanvas.Children.Add(
      new Line() { X1 = _previousPosition.X, 
        Y1 = _previousPosition.Y, 
        X2 = pt.X, 
        Y2 = pt.Y, 
        Stroke = _stroke, 
        StrokeThickness = 2 } 
      );
    _previousPosition = pt;
  } 
}

Listed above is all the code we need to track the pointer and render the scribble. I start tracking every time the pointer is pressed and stop every time it is released, keeping track with the _pressed flag. While the pointer is pressed, for each PointerMoved event I get all the positions the pointer traversed between the previous occurrence of the event and this one (reported through the GetIntermediatePoints() API), and draw a Line connecting each consecutive pair of positions, starting with the last position the pointer was at before this PointerMoved event. The _previousPosition variable keeps track of that last position. In case you are wondering, GetIntermediatePoints() returns a collection of PointerPoint instances, so the Select() projects them to a collection of Point instances.

The image below shows what’s rendered on screen.

[Image: the scribble rendered on screen over the background image]

Now on to the real issue at hand (#4): how to save this as a single bitmap.

Before we dive into the APIs, let’s talk a little bit about bitmaps. A bitmap is essentially a matrix of pixels, so a bitmap with a width of 400 and a height of 300 can be thought of as a matrix with 400 columns and 300 rows, with each cell (a row/column intersection) representing a pixel. Each pixel in turn represents a resultant color as some combination of the three basic color channels (blue, green and red) and an alpha (transparency) value. So with 8 bits per channel, each pixel can be thought of as a 4-byte value, with three of the bytes holding a color channel value (0 to 255) and the fourth holding alpha (again 0 to 255). The order of these bytes can vary; the bytes in a pixel could, for instance, be in BGRA order or RGBA order, something you need to take into account when manipulating images.
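
For instance, here is a minimal illustrative sketch (mine, not from the app) of the same fully opaque red pixel laid out under the two byte orders:

byte[] redAsBgra = { 0x00, 0x00, 0xFF, 0xFF }; //Blue, Green, Red, Alpha
byte[] redAsRgba = { 0xFF, 0x00, 0x00, 0xFF }; //Red, Green, Blue, Alpha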

Another thing to be aware of is that the pixel data in a bitmap is often represented as a flattened array of pixels. The way this works in WinRT is by taking each row of pixels (starting with the topmost) and concatenating them together. So if we had a 3 x 3 bitmap (yes, that’s mostly useless) like so:

P11  P12  P13
P21  P22  P23
P31  P32  P33

when flattened, the array would be [P11, P12, P13, P21, P22, P23, P31, P32, P33].
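
To put the flattening into code, here is a small illustrative helper (my own, not part of the app) that computes where the pixel at column x, row y lands in the flattened array of a bitmap that is width pixels across:

int PixelIndex(int x, int y, int width)
{
  //rows are concatenated top to bottom, so skip y full rows, then x pixels
  //e.g. P21 in the 3 x 3 bitmap above sits at index 1 * 3 + 0 = 3
  return y * width + x;
}

If each pixel is stored as 4 bytes (say in BGRA order), multiplying this index by 4 gives the byte offset of the pixel’s first byte, which is exactly the calculation the save code later in this post relies on.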

Armed with this knowledge, I applied the following basic thinking to solve the problem: record all the positions that the pointer traverses; when the user saves, get the raw pixels for the background image, change the pixels that match the coordinates of the recorded pointer positions to a color of your choice (like the red stroke I used to render), and save the resultant bitmap. Presto!

For raw pixel manipulation, WinRT provides the BitmapDecoder and BitmapEncoder types, which prove very useful. BitmapDecoder can take a bitmap and provide you with its array of pixels (flattened as we discussed above); BitmapEncoder can accept an array of pixels and save it out as a bitmap. Both types work with most of the industry-standard image formats (JPEG, PNG, TIFF, BMP, etc.).

[NOTE: These types have a fairly rich API and can be useful in many ways when it comes to image handling. I will only discuss the portions pertaining to this post; I intend to detail their full functionality in a future post. In the meantime, do read about them at the Windows 8 Dev Center.]

 

So I changed my code as below (the additions are the _allPoints collection and the code that populates it):

Point _previousPosition = default(Point);
SolidColorBrush _stroke = new SolidColorBrush(Colors.Red);
List<List<Point>> _allPoints = new List<List<Point>>();
bool _pressed = false;

private void inkCanvas_PointerMoved_1(object sender, 
  PointerRoutedEventArgs e)
{
  if (!_pressed) return;
  var positions = e.GetIntermediatePoints(inkCanvas).
    Select(ppt => ppt.Position);
  foreach (Point pt in positions)
  {
         
    inkCanvas.Children.Add(
      new Line() { 
        X1 = _previousPosition.X, 
        Y1 = _previousPosition.Y, 
        X2 = pt.X, Y2 = pt.Y, 
        Stroke = _stroke, 
        StrokeThickness = 2 } 
      );
    _previousPosition = pt;
  }
  _allPoints.Last().AddRange(positions);
}

private void inkCanvas_PointerPressed_1(object sender, 
  PointerRoutedEventArgs e)
{
  _previousPosition = e.GetCurrentPoint(inkCanvas).Position;
  _allPoints.Add(new List<Point>());
  _allPoints.Last().Add(_previousPosition);

      
  _pressed = true;
}

private void inkCanvas_PointerReleased_1(object sender, 
  PointerRoutedEventArgs e)
{
  if (_pressed)
    _pressed = false;
}

private async Task SaveBitmapAsync()
{
  List<byte> _pixelData = null;
  //convert the source bitmap to a pixel array
  var srcfile = await StorageFile.GetFileFromApplicationUriAsync(
    new Uri("ms-appx:///assets/ChichenItza.jpg"));

  BitmapDecoder decoder = null;
  using (IRandomAccessStream stm = 
    await srcfile.OpenAsync(FileAccessMode.Read))
  {
    decoder = await BitmapDecoder.CreateAsync(
      BitmapDecoder.JpegDecoderId, stm);
    _pixelData = (await decoder.GetPixelDataAsync()).
        DetachPixelData().ToList();
  }; 

  //do our pixel manipulation here

  //write it back
  var destfile = await KnownFolders.PicturesLibrary.CreateFileAsync(
    string.Format("{0}_signature.jpg", 
    DateTimeOffset.Now.ToString("MM_dd_yy_hh_mm_ss")), 
    CreationCollisionOption.ReplaceExisting);
  using (IRandomAccessStream stm = 
    await destfile.OpenAsync(FileAccessMode.ReadWrite))
  {
    BitmapEncoder encoder = await BitmapEncoder.CreateAsync(
      BitmapEncoder.JpegEncoderId, stm);
    encoder.SetPixelData(
      decoder.BitmapPixelFormat, 
      BitmapAlphaMode.Straight, 
      800, 535, decoder.DpiX, decoder.DpiY, 
      _pixelData.ToArray());

    await encoder.FlushAsync();
  };
}
 
private async void btnSave_Tapped_1(object sender, TappedRoutedEventArgs e)
{
  await SaveBitmapAsync();
} 

Every time the user lifts the pointer, we can consider a “stroke” to be completed. So when the pointer is pressed I create a new List<Point> and add it to the tail of a List<List<Point>> representing my collection of strokes, and as the pointer moves, I record the positions in the newly created list.

The SaveBitmapAsync() method is what gets invoked when the user taps the Save button. In there I start by loading the image StorageFile (stored as a resource in my app), opening it as a random access stream, creating a BitmapDecoder over that stream (note the static creation method), and using the decoder to get the pixel array: GetPixelDataAsync() returns a PixelDataProvider, which in turn hands back the flattened pixel array through a call to DetachPixelData(). Once my pixel manipulation is done, I create a BitmapEncoder (again via a static factory method), create a file in the Pictures library (named with the current date/time to avoid name collisions), and save the manipulated pixel array into the file (actually into a writeable random access stream obtained from the file), using the same pixel format and DPI settings that the decoder reported for the original image.

Now all that was left was to color the pixels based on my recorded pointer positions. I will not bore you with all my trials here, but I soon realized that there was an error in my basic thinking. Recording the pointer positions and coloring them was not enough, since the WinRT pointer APIs do not report every pixel your pointer traverses (and for good reason: a) the UI thread simply would not be able to keep up with it, and b) you simply do not need that kind of accuracy for app-specific input targeting). The image below shows the result of this erroneous thinking, a faint dotted version of my original scribble, since I only plotted the recorded pointer positions.

[Image: 07_13_12_11_04_57_signature.jpg, the saved bitmap showing only a faint, dotted version of the scribble]

 

If you are wondering why we do not see this in the first image, recall that when rendering on screen we actually join the recorded positions with a Line, so any gaps are covered. That brings me to the last piece of the puzzle: I needed to do in the saved bitmap what I was doing on screen (by drawing those lines). To do that, I needed to interpolate extra points between each pair of recorded points to get enough “density” that the scribble no longer looked disjointed, and I decided to use linear interpolation.

[Note: Methods of interpolation used in the general problem of curve fitting vary in their degree of error (since they are, in fact, all approximations), and linear interpolation is considered a relatively less accurate method. However, it is the easiest to implement, and since the distances between the boundary points I was interpolating between were very small to begin with, linear interpolation worked well in my case. You may consider more advanced interpolation methods (like spline interpolation) if you feel like being mathematically adventurous.]

 

private void Interpolate(Point pFrom, Point pTo, 
  ref List<Point> results)
{
  var xDiff = Math.Abs((pTo.X - pFrom.X));
  var yDiff = Math.Abs((pTo.Y - pFrom.Y));

  if (xDiff < 1 && yDiff < 1) return; //we stop 

  if (yDiff > xDiff) //more vertical: pick the midpoint along Y and solve for X
  {
    Point p0 = pFrom.Y < pTo.Y ? pFrom : pTo;
    Point p1 = pFrom.Y < pTo.Y ? pTo : pFrom;

    double y = (yDiff / 2) + Math.Min(pFrom.Y, pTo.Y);
    Point newPt = new Point(p0.X + (y - p0.Y) * 
      ((p1.X - p0.X) / (p1.Y - p0.Y)), y);
    results.Add(newPt);
    Interpolate(pFrom, newPt, ref results);
    Interpolate(newPt, pTo, ref results);
  }
  else //more horizontal: pick the midpoint along X and solve for Y
  {
    Point p0 = pFrom.X < pTo.X ? pFrom : pTo;
    Point p1 = pFrom.X < pTo.X ? pTo : pFrom;

    double x = (xDiff / 2) + Math.Min(pFrom.X, pTo.X);
    Point newPt = new Point(x, p0.Y + (x - p0.X) * 
      ((p1.Y - p0.Y) / (p1.X - p0.X)));
    results.Add(newPt);
    Interpolate(pFrom, newPt, ref results);
    Interpolate(newPt, pTo, ref results);
  }
}

The Interpolate() method takes two points and recursively interpolates points in between. At each level of recursion it checks the distance between the two points, picks the midpoint along the axis with the larger difference (which yields the larger sample size), and solves the straight-line equation between the two points for the other coordinate. It stops recursing when the points in question differ by less than 1 logical pixel along both coordinates. If you want to dig deeper, read up on linear interpolation.
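
As a quick, made-up usage example (the coordinates are arbitrary), this is how a single pair of recorded positions gets densified:

List<Point> filled = new List<Point>();
Interpolate(new Point(10, 10), new Point(18, 13), ref filled);
//filled now holds the recursively inserted midpoints (not in segment order);
//together with the two endpoints, every gap along the segment is now smaller
//than 1 logical pixel in both X and Y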

Now my SaveBitmapAsync() method looks like this:

private async Task SaveBitmapAsync()
{
  List<byte> _pixelData = null;
  //convert the source bitmap to a pixel array
  var srcfile = await StorageFile.GetFileFromApplicationUriAsync(
    new Uri("ms-appx:///assets/ChichenItza.jpg"));

  BitmapDecoder decoder = null;
  using (IRandomAccessStream stm = 
    await srcfile.OpenAsync(FileAccessMode.Read))
  {
    decoder = await BitmapDecoder.CreateAsync(
      BitmapDecoder.JpegDecoderId, stm);
    _pixelData = (await decoder.GetPixelDataAsync()).
        DetachPixelData().ToList();
  }; 

  //do our pixel manipulation here
  List<Point> all = _allPoints.Select(stroke =>
  {
    var interpolationresults = stroke.SelectMany((pt, idx) =>
    {

      List<Point> result = new List<Point>();
      if (idx + 1 == stroke.Count) return result;
      Interpolate(stroke[idx], stroke[idx + 1], ref result);
      return result;
    }).ToList();
    return stroke.Concat(interpolationresults).ToList();
  }).SelectMany(list => list).ToList();



  foreach (Point p in all)
  {
    int idx = ((int)p.Y * 800 + (int)p.X) * 4; //(row * width + column) * 4 bytes per pixel
    _pixelData[idx] = Colors.Red.B;
    _pixelData[idx + 1] = Colors.Red.G;
    _pixelData[idx + 2] = Colors.Red.R;
    _pixelData[idx + 3] = Colors.Red.A;
  }
  //write it back
  var destfile = await KnownFolders.PicturesLibrary.CreateFileAsync(
    string.Format("{0}_signature.jpg", 
    DateTimeOffset.Now.ToString("MM_dd_yy_hh_mm_ss")), 
    CreationCollisionOption.ReplaceExisting);
  using (IRandomAccessStream stm = 
    await destfile.OpenAsync(FileAccessMode.ReadWrite))
  {
    BitmapEncoder encoder = await BitmapEncoder.CreateAsync(
      BitmapEncoder.JpegEncoderId, stm);
    encoder.SetPixelData(
      decoder.BitmapPixelFormat, 
      BitmapAlphaMode.Straight, 
      800, 535, decoder.DpiX, decoder.DpiY, 
      _pixelData.ToArray());

    await encoder.FlushAsync();
  };
}

Note the new portion under the “do our pixel manipulation here” comment. We run a query over our collection of strokes (each stroke being a collection of Points), and for each pair of consecutive points in each stroke we interpolate intermediate points using the Interpolate() method. We then build a flattened collection of all the points, including the points originally recorded and the points we interpolated. For each of these Points, we find the corresponding pixel in the flattened pixel array (note how the index is computed: we know the width of the image to be 800 and each pixel takes 4 bytes), and we change the 4 bytes representing that pixel to the color red (using BGRA order). Here is a final saved image:

[Image: 07_13_12_11_36_14_signature.jpg, the saved bitmap with the interpolated scribble]

Better, right? You may still notice that the stroke is a little thinner; that is because while rendering on screen I was using a stroke thickness of 2. If you want to implement a stroke thickness greater than 1, you can change the pixel manipulation code to color the immediately neighboring pixels as well, as in the sketch below.
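
Here is a rough sketch of that idea (my own variation, reusing the hard-coded 800 x 535 dimensions from the rest of the post); it would replace the pixel coloring loop in SaveBitmapAsync() and paints a small square neighborhood around each point:

const int width = 800, height = 535, radius = 1;
foreach (Point p in all)
{
  for (int dy = -radius; dy <= radius; dy++)
  {
    for (int dx = -radius; dx <= radius; dx++)
    {
      int x = (int)p.X + dx, y = (int)p.Y + dy;
      if (x < 0 || x >= width || y < 0 || y >= height) continue; //stay inside the bitmap
      int idx = (y * width + x) * 4;
      _pixelData[idx] = Colors.Red.B;
      _pixelData[idx + 1] = Colors.Red.G;
      _pixelData[idx + 2] = Colors.Red.R;
      _pixelData[idx + 3] = Colors.Red.A;
    }
  }
}

With radius set to 1, this colors a 3 x 3 block per point, which roughly matches the stroke thickness of 2 used on screen.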

Note that this solution is still fairly immature and has a lot of room for improvement. The pixel manipulation and interpolation code is embarrassingly parallel, and should be parallelized; one possible direction is sketched below. You might also consider implementing this in C++ (and using C++ AMP) to not only parallelize but do so on the GPU. Handling stroke thickness, higher-resolution images (more processing, and thus more reason to parallelize) and so on can add to the complexity.
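
As one illustrative direction (untested, and assuming PLINQ is available in your target .NET profile), the per-stroke interpolation query could be pushed through AsParallel(), since each stroke is processed independently:

//sketch only: run the interpolation for each stroke in parallel;
//the order of the resulting points does not matter for the pixel writes
List<Point> all = _allPoints
  .AsParallel()
  .Select(stroke =>
  {
    var interpolationresults = stroke.SelectMany((pt, idx) =>
    {
      List<Point> result = new List<Point>();
      if (idx + 1 == stroke.Count) return result;
      Interpolate(stroke[idx], stroke[idx + 1], ref result);
      return result;
    }).ToList();
    return stroke.Concat(interpolationresults).ToList();
  })
  .SelectMany(list => list)
  .ToList();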

But hopefully this will give you a starting point if you face a similar problem. More next time!

- Jit