Wednesday, September 14, 2011

C# is a very type-safe language

In C#, type safety is enforced not only at compile time but also at runtime. This will not compile:

Student s = new Person();

This is not allowed in C++ either. C++ is type-safe too, albeit not as strongly typed as C#; we will discuss later what makes C++'s type safety a bit weaker. This will not compile either:


Student *s = new Person();


Now, let's try to coerce C# into pointing a Student reference at a Person object. This will compile:

Student s = (Student) new Person();

Let's try the same with C++. This will compile too:

Student *s = (Student *) new Person();



Both snippets above compile, but this is where C# and C++ diverge, not only in terms of type safety but also in terms of runtime security (which we will discuss immediately). At runtime, C# will raise an error (an InvalidCastException), whereas C++ will happily do your bidding without raising any error. Now, how dangerous is that kind of code?
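For comparison, C++ can perform a runtime-checked downcast too, but only if you ask for it with dynamic_cast instead of the C-style cast. The following is a minimal sketch, not part of the original example: the virtual destructor is added here because dynamic_cast only checks polymorphic types, and isReallyStudent is a hypothetical helper name.

```cpp
// Hypothetical variants of the post's classes. The virtual destructor makes
// Person polymorphic, which dynamic_cast requires for a checked downcast.
class Person
{
public:
    virtual ~Person() {}
    int favoriteNumber = 0;
};

class Student : public Person
{
public:
    int idNumber = 0;
};

// Returns true only if p really points to a Student (or a class derived from it).
bool isReallyStudent(Person *p)
{
    // dynamic_cast inspects the object's actual type at runtime and yields
    // a null pointer when the downcast is invalid, instead of blindly
    // reinterpreting the memory the way (Student *) does.
    return dynamic_cast<Student *>(p) != nullptr;
}
```

Called with the address of a plain Person, isReallyStudent returns false; called with the address of a Student, it returns true. The difference from C# is that the check is opt-in: the C-style cast shown earlier skips it entirely.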


Imagine these are your classes:

class Person
{
 public int favoriteNumber;
}

class Student : Person
{
 public int idNumber;
}



Suppose you allocate memory for two Person objects, and it so happens that the two allocations are adjacent to each other: Person pA = new Person(); Person pB = new Person(); If the runtime let us point Student s at Person pA, then assignments to the fields of Student s could overwrite the contents of the adjacent Person pB object, because a Student needs more memory than was reserved for a Person.
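The size mismatch behind that overwrite can be checked directly. This is a small sketch in C++, assuming the same two classes translated from the C# above; bytesPastPerson is a hypothetical helper name, and the exact sizes depend on the compiler and platform.

```cpp
#include <cstddef>

class Person
{
public:
    int favoriteNumber;
};

class Student : public Person
{
public:
    int idNumber;
};

// A Student carries everything a Person has plus idNumber, so it needs more
// room than the memory reserved for a Person.
static_assert(sizeof(Student) > sizeof(Person),
              "Student must be larger than Person");

// Hypothetical helper: how many bytes of a Student fall outside a
// Person-sized allocation.
std::size_t bytesPastPerson()
{
    return sizeof(Student) - sizeof(Person);
}
```

Whatever bytesPastPerson returns on your platform, those bytes belong to something else when a Student pointer is aimed at a Person-sized allocation.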



To make the principle more concrete, let's do it in a language that leniently allows that sort of thing.

#include <cstdio>

class Person
{
public:
    int favoriteNumber;
};

class Student : public Person
{
public:
    int idNumber;
};

int main()
{
    // Two Person objects, adjacent in memory.
    Person *p = new Person[2];
    p[0].favoriteNumber = 7;
    p[1].favoriteNumber = 6;

    printf("\nP1 Fave# %d", p[0].favoriteNumber);
    printf("\nP2 Fave# %d", p[1].favoriteNumber);

    void *objek = (void *) &p[0];
    // above code is equivalent to C#'s:
    // object objek = p[0];

    Student *s = (Student *) objek;
    // above code is equivalent to C#'s:
    // Student s = (Student) objek;

    // s really points to a Person, so this write is undefined behavior
    // in C++; the compiler does not stop us.
    s->idNumber = 9;
    printf("\n\n");
    printf("\nS# %d", s->idNumber);
    printf("\nP1 Fave# %d", p[0].favoriteNumber);
    printf("\nP2 Fave# %d", p[1].favoriteNumber);

    p[1].favoriteNumber = 8;
    printf("\n\n");
    printf("\nS# %d", s->idNumber);
    printf("\n\n");

    delete[] p;
    return 0;
}


The output of that code:


P1 Fave# 7
P2 Fave# 6


S# 9
P1 Fave# 7
P2 Fave# 9


S# 8



Can you spot the anomalies? We assigned the value 9 to the student's idNumber, yet P2's favoriteNumber also changed. We then changed P2's favoriteNumber, yet the student's idNumber also changed. Simply put, the Student's fields overlap the memory of another object's fields. (Strictly speaking, C++ leaves this behavior undefined; the output above is what this particular run produced.) That is the problem when a language allows a pointer of one type to be aimed at an object of any other type.


.........Person
[0xCAFE] (favoriteNumber) : 7
[0xCAFF] (favoriteNumber) : 6


Student points to the first person (which has an adjacent person):

.........Person                Student
[0xCAFE] (favoriteNumber) : 7  (favoriteNumber)
[0xCAFF] (favoriteNumber) : 6  (idNumber)


If a language allows pointing a Student at a Person's memory location (0xCAFE), what happens when we change the value of the Student's idNumber? The adjacent second person's favoriteNumber is changed too, which is an unintentional corruption of memory. Worse yet, if this is done on purpose, it is a potential security problem. If a Student can point to any object type, its idNumber can be used to peek at another object's contents, even when that object's fields are private.
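The overlap in the diagram can also be reproduced without relying on undefined behavior, by modelling memory as a plain array of ints. This is only an illustrative sketch: Memory, StudentView and simulateOverlap are hypothetical names, and each array slot stands in for one Person's favoriteNumber.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: each int slot stands for one Person's favoriteNumber,
// laid out back to back like the two-element array in the program above.
using Memory = std::array<int, 2>;

// A "Student" laid over the memory starting at a given slot: its first field
// lands on that slot and its second field lands on the next one.
struct StudentView
{
    Memory &memory;
    std::size_t start;

    int &favoriteNumber() { return memory[start]; }
    int &idNumber() { return memory[start + 1]; }
};

// Writes idNumber through a view placed over the first Person and returns
// the resulting memory.
Memory simulateOverlap()
{
    Memory memory = {7, 6};     // P1 = 7, P2 = 6
    StudentView s{memory, 0};   // Student "points to" the first Person
    s.idNumber() = 9;           // lands on P2's slot
    return memory;
}
```

After simulateOverlap runs, the first slot still holds 7 and the second holds 9, mirroring the P1 and P2 values in the output shown earlier. Reading s.idNumber() instead of writing it would expose P2's value the same way, which is the peeking problem described above.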
