References and Pointers, Part Two

Here's a handy type I whipped up when I was translating some complex pointer-manipulation code from C to C#. It lets you make a safe "managed pointer" to the interior of an array. You get all the operations you can do on an unmanaged pointer: you can dereference it as an offset into an array, do addition and subtraction, compare two pointers for equality or inequality, and represent a null pointer. But unlike the corresponding unsafe code, this code doesn't mess up the garbage collector and will assert if you do something foolish, like try to compare two pointers that are interior to different arrays. (*) Enjoy!

using System;
using System.Diagnostics;

internal struct ArrayPtr<T>
{
  public static ArrayPtr<T> Null { get { return default(ArrayPtr<T>); } }
  private readonly T[] source;
  private readonly int index;

  private ArrayPtr(ArrayPtr<T> old, int delta)
  {
    this.source = old.source;
    this.index = old.index + delta;
    Debug.Assert(index >= 0);
    Debug.Assert(index == 0 || this.source != null && index < this.source.Length);
  }

  public ArrayPtr(T[] source)
  {
    this.source = source;
    index = 0;
  }

  public bool IsNull()
  {
    return this.source == null;
  }

  public static bool operator <(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index < b.index;
  }
       
  public static bool operator >(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index > b.index;
  }
       
  public static bool operator <=(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index <= b.index;
  }
       
  public static bool operator >=(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index >= b.index;
  }
      
  public static int operator -(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index - b.index;
  }
       
  public static ArrayPtr<T> operator +(ArrayPtr<T> a, int count)
  {
    return new ArrayPtr<T>(a, +count);
  }
       
  public static ArrayPtr<T> operator -(ArrayPtr<T> a, int count)
  {
    return new ArrayPtr<T>(a, -count);
  }
       
  public static ArrayPtr<T> operator ++(ArrayPtr<T> a)
  {
    return a + 1;
  }
     
  public static ArrayPtr<T> operator --(ArrayPtr<T> a)
  {
    return a - 1;
  }

  public static implicit operator ArrayPtr<T>(T[] x)
  {
    return new ArrayPtr<T>(x);
  }

  public static bool operator ==(ArrayPtr<T> x, ArrayPtr<T> y)
  {
    return x.source == y.source && x.index == y.index;
  }

  public static bool operator !=(ArrayPtr<T> x, ArrayPtr<T> y)
  {
    return !(x == y);
  }

  public override bool Equals(object x)
  {
    if (x == null) return this.source == null;
    var ptr = x as ArrayPtr<T>?;
    if (!ptr.HasValue) return false;
    return this == ptr.Value;
  }

  public override int GetHashCode()
  {
    unchecked
    {
      int hash = this.source == null ? 0 : this.source.GetHashCode();
      return hash + this.index;
    }
  }

  public T this[int index]
  {
    get { return source[index + this.index]; }
    set { source[index + this.index] = value; }
  }
}

Now we can do stuff like:

double[] arr = new double[10];
var p0 = (ArrayPtr<double>)arr;
var p5 = p0 + 5;
p5[0] = 123.4; // sets arr[5] to 123.4
var p7 = p0 + 7;
int diff = p7 - p5; // 2

Pretty neat, eh?

UPDATE:

A number of people in the comments have asked why the code disallows a pointer "past the end" of the array. In fact the original C code that I was porting did use an invalid "marker" value as the "end of array" marker, and the code did thereby manipulate pointers "past the end" of the array. The original version of the ArrayPtr class that I actually used in the port to C# supported having a pointer one past the end of the array, and threw an exception if you ever tried to dereference it. I thought this detail was distracting from the point of the article so I eliminated the feature from the C# code before I posted it. Perhaps that was a premature optimization.
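
As a rough illustration of what such a variant might look like, here is a minimal sketch. The name EndAwareArrayPtr<T> and every detail below are assumptions made for illustration, not the code actually used in the port; null-pointer support and most operators are left out to keep it short. The only point it tries to show is that the "one past the end" position can be constructed and compared, but throws when dereferenced.

```csharp
using System;

// Hypothetical sketch: the name EndAwareArrayPtr<T> and all details are
// assumptions for illustration, not the type used in the original port.
internal struct EndAwareArrayPtr<T>
{
  private readonly T[] source;
  private readonly int index;

  public EndAwareArrayPtr(T[] source) : this(source, 0) { }

  private EndAwareArrayPtr(T[] source, int index)
  {
    if (source == null) throw new ArgumentNullException("source");
    // index == source.Length is allowed: that is the one-past-the-end position.
    if (index < 0 || index > source.Length)
      throw new ArgumentOutOfRangeException("index");
    this.source = source;
    this.index = index;
  }

  public static EndAwareArrayPtr<T> operator +(EndAwareArrayPtr<T> a, int count)
  {
    return new EndAwareArrayPtr<T>(a.source, a.index + count);
  }

  public static bool operator <(EndAwareArrayPtr<T> a, EndAwareArrayPtr<T> b)
  {
    if (!Object.ReferenceEquals(a.source, b.source))
      throw new InvalidOperationException("Pointers are into different arrays.");
    return a.index < b.index;
  }

  public static bool operator >(EndAwareArrayPtr<T> a, EndAwareArrayPtr<T> b)
  {
    return b < a;
  }

  public T this[int offset]
  {
    get { return source[Resolve(offset)]; }
    set { source[Resolve(offset)] = value; }
  }

  // Dereferencing is the only place where the end position is rejected.
  private int Resolve(int offset)
  {
    int i = index + offset;
    if (i < 0 || i >= source.Length)
      throw new InvalidOperationException("Cannot dereference outside the array.");
    return i;
  }
}
```

With a type shaped like this, a loop of the form "for (var p = start; p < end; p = p + 1)" can run to completion when end is start plus the array length, which is the C idiom the marker value supported.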

I have a similar C# wrapper type for strings, where again, I permit a pointer "past the end" of the string where the null-terminating character would be in a C program. That class also supports common C idioms like "strlen" and whatnot. Such types are very handy when porting C code to C#. In the long run it is of course better to use C# idioms, but in the short run it is very useful to be able to get things working quickly.
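
A string wrapper along those lines could be sketched roughly as follows. The name StringPtr, the Strlen method, and in particular the choice to return '\0' when reading the one-past-the-end position are all assumptions made for this example; the actual class is not shown in the article.

```csharp
using System;

// Hypothetical sketch of a read-only "pointer into a string".
internal struct StringPtr
{
  private readonly string source;
  private readonly int index;

  public StringPtr(string source) : this(source, 0) { }

  private StringPtr(string source, int index)
  {
    if (source == null) throw new ArgumentNullException("source");
    // index == source.Length is allowed: the position of C's terminator.
    if (index < 0 || index > source.Length)
      throw new ArgumentOutOfRangeException("index");
    this.source = source;
    this.index = index;
  }

  public static StringPtr operator +(StringPtr p, int count)
  {
    return new StringPtr(p.source, p.index + count);
  }

  // Assumption: reading the one-past-the-end position yields '\0',
  // mimicking the terminator a C string would have there.
  public char this[int offset]
  {
    get
    {
      int i = index + offset;
      if (i < 0 || i > source.Length) throw new IndexOutOfRangeException();
      return i == source.Length ? '\0' : source[i];
    }
  }

  // Counts characters from the current position up to the first '\0',
  // which is how C's strlen behaves.
  public int Strlen()
  {
    int length = 0;
    while (this[length] != '\0') length++;
    return length;
  }
}
```

Because the terminator is simulated rather than stored, C loops of the form "while (p[0] != '\0')" can be ported nearly line for line.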

------------

(*) Were this to be a public type then I'd make the assertions into exceptions because there is no telling what crazy thing the public is going to do; since this is an internal type I can guarantee that I'm using it correctly, so I'll use an assertion instead.
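
To make the footnote concrete, here is a minimal sketch of how one of the checks might look in a public version. PublicArrayPtr<T> is a hypothetical name and only pointer subtraction is shown; the choice of InvalidOperationException is likewise an assumption, not something the article specifies.

```csharp
using System;

// Hypothetical public counterpart; only pointer subtraction is shown.
public struct PublicArrayPtr<T>
{
  private readonly T[] source;
  private readonly int index;

  public PublicArrayPtr(T[] source, int index)
  {
    this.source = source;
    this.index = index;
  }

  public static int operator -(PublicArrayPtr<T> a, PublicArrayPtr<T> b)
  {
    // An exception is raised in every build configuration, whereas
    // Debug.Assert calls are compiled out when DEBUG is not defined.
    if (!Object.ReferenceEquals(a.source, b.source))
      throw new InvalidOperationException(
        "Cannot subtract pointers that are interior to different arrays.");
    return a.index - b.index;
  }
}
```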

  • @jader: A null Nullable<T> is converted to a Nullable<T> where HasValue = false, so the next line will simply return false in that situation.

  • I think that you should assert that the array length is not zero.

    Otherwise, such a condition just leads to all kinds of craziness.

  • @Shuggy: It's not a bad thing because of the way the addition method is written. As written, this class would assert if you go off the end of an array, which would disallow looping through using the pointer. (I.e., it disallows some of the useful kinds of things it was written to allow.)

    E.g.:

    int[] array = {1,2,3,4,5};
    ArrayPtr<int> ptr = array;
    while ( ptr < (some way of getting a pointer to the end) )
    {
     ptr[0] = 0;
     ptr = ptr + 3; //Assert fires here on the second time through, even though we'll never use the invalid value
    }

    The assert should be on the indexer, not the constructor: even as written, it misses the case where you ask for ptr[1] where it already points to the end of the array, for example. (though this will already throw an exception, so it's possibly by design)

  • Why did you use Object.ReferenceEquals to compare array references? Doesn't the == operator compare arrays as references anyway?

  • Why do you use assertions instead of exceptions?

  • @configurator

    An IEnumerator<T> which starts at the position the pointer was at and goes all the way to the end of the array.

  • @Ben

    Personally I believe modifying the guard (in this case to be <= end - 2) is appropriate.

    If the guard would become very complex, then working out what you were going to increment/decrement and doing a break if it would go out of bounds is also fine.

    I admit it might make the code slightly less easy to read, but it also ensures that nobody ever has access to an invalid pointer they might use.

    If you cared you could easily write a SafeIncrement method which returned null (or some other sentinel) once you fell off the edge and use the sentinel in your guard instead.

  • In your example use code, shouldn't

      var p5 = arr + 5;

    be

      var p5 = p0 + 5;

    ?

    Otherwise it doesn't compile.

  • @Monsignor

    Did you miss the footnote?

    "(*) Were this to be a public type then I'd make the assertions into exceptions because there is no telling what crazy thing the public is going to do; since this is an internal type I can guarantee that I'm using it correctly, so I'll use an assertion instead."

  • @Shuggy

    I agree that a guard is probably a better idea, but the beauty of allowing pointers to outside the bounds is it retains the "object safety" of the class as written (i.e., you get the assert when you try to compare an invalid value of one array to a valid value of another). I'd possibly go for a hybrid, now that you mention it: have all invalid values compare equal, and disallow any operation other than comparison on them. This would allow you to use invalid values as guards, keeping the safety check of "comparing from the same array", while minimising the possibility that you'll miss seeing an invalid value misused.

  • Nice code. A question on your programming style: do you use assertions frequently when writing internal classes? I have heard a lot of guys say to avoid assertions because they clutter the actual code. While I think assertions should be used proactively, I would like to get your view on this as well.

    Thanks

  • I believe your this[] operator should check for bound issues. Maybe something like this would be enough.

     public T this[int index]
     {
       get { return index == 0 ? source[this.index] : (this + index)[0]; }
       set { if (index == 0) source[this.index] = value; else (this + index)[0] = value; }
     }

  • Your decision to safely cast 'x' as a Nullable<ArrayPtr<T>> in Equals() threw me for a moment, as 'x' (typed as 'object') could not possibly be a Nullable<> (nullables lose their identity as nullables when boxed). But this technique lets you achieve with a single 'as' cast what would otherwise require both an 'is' check and an unsafe cast. Very clever.

  • In C and C++, taking the difference between two pointers into the same object is well-defined, whether or not that object is an array.  Is it safe to assume that there is no library implementation which would provide equivalent functionality in C#?

  • @Ben: Not safely or with particularly good performance, but you could do something like this: http://pastie.org/1706405

    Please do not use that technique for anything other than academic purposes.  It really is not a good idea.
