Thursday, May 10, 2012

Generics object-orientation. Untyped generic is the key to generic's OOPness

Suppose you have these classes:


class Animal {
}

class Dog : Animal {
}

class Plant {

}


We know that...
...
{
 // these work:
 MakeSound(new Animal());
 MakeSound(new Dog());
 
 // and this doesn't compile, because a Plant is not an Animal:
 MakeSound(new Plant());
}


public static void MakeSound(Animal a) {
}





Then suppose we have this existing code:

public static void AddAnimal(IList<Animal> aList) {
 foreach(Animal a in aList) {
 }
 
 aList.Add(new Animal());
}


And we want that function to be instantly usable with every type derived from Animal. That is, we want an IList<Dog> to be accepted by that function too.


That is not possible, and if it were, it would be dangerous, as we shall discover later on. So this will fail:

IList<Dog> dogs = new List<Dog>();
AddAnimal(dogs);

It produces this compile-time error:


cannot convert `System.Collections.Generic.IList<Dog>' expression to type `System.Collections.Generic.IList<Animal>'


For AddAnimal to accept other types, we follow this pattern:

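public static void AddAnimal<T>(IList<T> aList) where T : new() {
 // T is unconstrained here, so the loop variable has to be T;
 // there is no conversion from an unconstrained T to Animal
 foreach(T a in aList) {
 }
 
 aList.Add(new T());
}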

With that function, the Dog in IList<Dog> can be slotted into the untyped T, so the compiler allows us to pass dogs, whose type is IList<Dog>, to that function. You need to put the new() constraint on the function declaration if you intend to create an object out of T. So this will work now:


IList<Dog> dogs = new List<Dog>();
AddAnimal(dogs);

And you could do this as well:


IList<Plant> plants = new List<Plant>();
AddAnimal(plants);

Oops! Any object-oriented programmer worth his salt can quickly discern that the code above is not object-oriented: Plant does not derive from Animal, and AddAnimal should accept Animals only. To enforce that, simply put a constraint on the types accepted by the generic's parameter. We just add where T : BaseType, where BaseType here is the Animal class:

public static void AddAnimal<T>(IList<T> aList) where T : Animal, new() {
  foreach(Animal a in aList) {
  }
  
  aList.Add(new T());
}  

This will not work anymore:

IList<Plant> plants = new List<Plant>();
AddAnimal(plants);

Its compilation error:

`Plant' cannot be used as type parameter `T' in the generic type or method `TestGenCompat.MainClass.AddAnimal<T>(System.Collections.Generic.IList<T>)'. There is no implicit reference conversion from `Plant' to `Animal'


To recap, these should work:

IList<Animal> anims = new List<Animal>();
AddAnimal(anims);

IList<Dog> dogs = new List<Dog>();
AddAnimal(dogs);
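
For comparison, the same bounded-generic pattern can be sketched in Java, the other language this post looks at. This is only an illustrative sketch: Java has no new() constraint, so the Supplier parameter named factory and the class name BoundedAddDemo are assumptions of mine, not part of the C# code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class Animal { }
class Dog extends Animal { }
class Plant { }

class BoundedAddDemo {
    // T must be Animal or a subtype of it.
    // The factory (hypothetical parameter) stands in for C#'s new() constraint.
    static <T extends Animal> void addAnimal(List<T> list, Supplier<T> factory) {
        for (Animal a : list) {
            // every element is known to be at least an Animal
        }
        list.add(factory.get()); // adds a T, never a plain Animal
    }

    public static void main(String[] args) {
        List<Animal> anims = new ArrayList<>();
        addAnimal(anims, Animal::new);

        List<Dog> dogs = new ArrayList<>();
        addAnimal(dogs, Dog::new);

        // Does not compile, Plant is not an Animal:
        // addAnimal(new ArrayList<Plant>(), Plant::new);

        Dog first = dogs.get(0); // safe, only Dogs were ever added
        System.out.println(anims.size() + " " + dogs.size());
    }
}
```

Because the method only ever adds a T, a list of Dogs stays a list of Dogs after the call.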

Now let's look again at the old code. I mentioned that it would be dangerous if it were possible to pass dogs to this method:

public static void AddAnimal(IList<Animal> aList) {
 foreach(Animal a in aList) {
 }
 
 aList.Add(new Animal());
}



What would happen if the language allowed passing a list of a derived type to that method? Let's try to simulate that.

public static void AddAnimal<T>(IList<T> xList) where T : Animal, new() {
 IList<Animal> aList = (IList<Animal>) xList;  

 foreach(Animal a in aList) {
 }
 
 aList.Add(new Animal());
}

But alas, a C# generic carries the type it is genericizing. Though the compiler accepts our cast from IList<T> to IList<Animal>, at run time the actual type of the passed object is checked against the type we are casting to. So if we pass a List<Dog>, which is not an IList<Animal>, the cast fails with an InvalidCastException at run time.


So to simulate the inherent danger when a language lets us use the generic merely as untyped, let's look at another language. Let's choose Java.

First, we already know that this direct cast is not valid and is caught at compile time, just as C# rejects casting a List<Dog> to a List<Animal>:

List<Dog> dogs = new ArrayList<Dog>();
List<Animal> anims = (List<Animal>)dogs;

Now let's turn to a Java method that is constrained to the Animal type, and try the cast inside it:

public static <T extends Animal> void addAnimal(List<T> aList)
  throws InstantiationException, IllegalAccessException
{
 // Unchecked cast: Java cannot verify the generic type argument at run time,
 // so this succeeds even for a List<Dog>. C# would reject it at run time.
 List<Animal> list = (List<Animal>) aList;

 for(Animal x : list) {
 }
 
 list.add(new Animal());
}


Now let's iterate the list after we passed it to that function:

{
 List<Dog> dogs = new ArrayList<Dog>();
 addAnimal(dogs);
 addAnimal(dogs);
 System.out.println("See " + dogs.size());

 for(Animal x : dogs ) {
  System.out.println(x);
 }
}

That code prints "See 2", so both calls succeeded. The problem is in the for loop:

Exception in thread "main" java.lang.ClassCastException: Animal cannot be cast to Dog

Though the dogs collection contains two Animals, which are compatible with Animal x, the for loop doesn't even reach that part (Animal x) of the loop. The mere act of extracting an object from the dogs iterator actually performs these steps:


Dog d = dogs.get(0); 
Animal x = d; 

The second line is perfectly fine. The first line is where the exception is thrown, but the root cause is the object in the collection: if it had not been possible to add an Animal to the dogs collection, we would not receive any casting exception, as all of dogs' elements would be Dogs.


So while a Dog Is-An Animal:

Dog x = new Dog();
Animal y = x;

An Animal Is-Not-A Dog, hence this results in a casting exception:

Animal a = new Animal(); // think of this as dogs.get(0)
Dog b = (Dog) a; // the cast the compiler inserts; throws ClassCastException
Animal x = b; // no error
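
The whole failure can be reproduced in one small program. The following is a minimal sketch under my own naming (HeapPollutionDemo and readingAsDogFails are hypothetical); it reads the elements back into a Dog variable so that the compiler-inserted cast to Dog is certain to run.

```java
import java.util.ArrayList;
import java.util.List;

class Animal { }
class Dog extends Animal { }

class HeapPollutionDemo {
    @SuppressWarnings("unchecked")
    static <T extends Animal> void addAnimal(List<T> aList) {
        // Unchecked cast: compiles with a warning, not verified at run time.
        List<Animal> list = (List<Animal>) aList;
        list.add(new Animal());
    }

    // Returns true if reading the elements back as Dogs throws.
    static boolean readingAsDogFails(List<Dog> dogs) {
        int count = 0;
        try {
            for (Dog d : dogs) {   // hidden cast to Dog happens here
                if (d != null) {
                    count++;
                }
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Dog> dogs = new ArrayList<>();
        addAnimal(dogs);                  // succeeds silently
        System.out.println(dogs.size());  // size() needs no cast, so no error yet
        System.out.println(readingAsDogFails(dogs));
    }
}
```

The add succeeds and the error only surfaces later, at the point where an element is read as a Dog, which can be far away from the code that caused it.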

With type erasure, this code:

public static <T extends Animal> void addAnimal(List<T> aList)
  throws InstantiationException, IllegalAccessException
{
 // Unchecked cast: the type argument can't be verified at run time
 List<Animal> list = (List<Animal>) aList;

 list.add(new Animal());
}

is actually compiled for the JVM roughly like this:

public static void addAnimal(List aList) {
   List list = aList;
   
   list.add(new Animal());
}
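
One way to observe erasure from inside a running program is to compare the runtime classes of two differently parameterized lists. This is a small sketch; the class name ErasureDemo and the helper sameRuntimeClass are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Animal { }
class Dog extends Animal { }

class ErasureDemo {
    // True when both lists share one runtime class,
    // i.e. the type argument left no trace on the object.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<Dog> dogs = new ArrayList<>();
        List<Animal> anims = new ArrayList<>();
        System.out.println(sameRuntimeClass(dogs, anims));
        System.out.println(dogs.getClass().getName());
    }
}
```

Both lists are plain java.util.ArrayList objects at run time, which is why the JVM has nothing to check the cast against.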


So that's it: in Java, adding an Animal to a List<Dog> cannot be reliably prevented at run time. The consequence is that when we ultimately retrieve the object from that list as its declared type, it causes a casting exception. C# generics prevent that scenario because they carry the type at run time; Java's generics erase the type and merely shift the burden of casting away from the programmer. Behind the scenes (at the JVM level), the elements of a Java generic collection are handled as plain Objects and are merely cast back to the declared type when accessed.
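
As a side note, Java's standard library does offer an opt-in runtime check through Collections.checkedList, which wraps a list and verifies every inserted element against a Class token. The sketch below shows how it makes the bad add fail immediately; the names CheckedListDemo and addIsRejected are hypothetical. It works only because the caller supplies Dog.class by hand, since the list itself cannot know its type argument.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Animal { }
class Dog extends Animal { }

class CheckedListDemo {
    @SuppressWarnings("unchecked")
    static <T extends Animal> void addAnimal(List<T> aList) {
        List<Animal> list = (List<Animal>) aList; // still an unchecked cast
        list.add(new Animal());
    }

    // Returns true if the list refuses the plain Animal.
    static boolean addIsRejected(List<Dog> dogs) {
        try {
            addAnimal(dogs);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The wrapper checks each inserted element against Dog.class.
        List<Dog> checked = Collections.checkedList(new ArrayList<Dog>(), Dog.class);
        System.out.println(addIsRejected(checked));
        System.out.println(checked.size());
    }
}
```

With the wrapper, the exception is raised at the add itself, so the list never ends up holding a non-Dog.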


So that is the rationale for not letting a function that takes an IList<Animal> accept an IList<Dog>. Allowing it would require type erasure of the generic's parameter, which C# is not designed to do.


To summarize, untyped generics coupled with type constraints (via where T : typehere) are the only way to achieve OOP nirvana on generics.
