Tuesday, December 14, 2010

Multi Key Maps.. myth or reality


What the heck ? Never heard of these, is it possible to access the same entry with multiple keys? Just somedays back thought of this, and said y not .

Our good old friend Map allows us to have an unique key which is mapped to a value. To access a value we must have an index which is unique. But what if we want to access the same entry but using multiple indices we need a slightly different mechanism.

Ok , so here I would discuss a way to achieve it. And we will put into use something we learned in school and that is Prime Number. These are basically the numbers that are divisible by themselves except 1.

Euclid’s Fundamental Theorem of Arithmetic states that Every integer can be written as a product of primes in an essentially unique way. And Euclid's second theorem states that the number of prime numbers is infinite.

Now, as per the first rule we conclude that prime numbers are the ones upon multiplication of which we can generate any number. So, can we say they are the basic components of any number.

Based on this we try to create a key of a map. Lets say we want to create a map in which a value can be accessed using 2 keys.
Here are the various keys that we are going to use and the index values they generate.
2*3 = 6
5*7 = 35
11*13 = 143 .. and so on. Note that each of the index has only 2 factors except 1.

Now we create a map from integer to a string with the above indices
6 --> “Hello”
35 --> “ World”
143 ---> “Sanjiv”

Say in c++, the search code will look something like this.

map my_map.begin ;
string get(int key)
{
for(map::iterator iter = my_map.begin(); iter != my_map.end(); iter++)

{

if(iter->first % key == 0) //Key matches with the index

return iter->second;

}

}

It looks so dumb that we are have to traverse the whole map to find the match. This completely denies the whole purpose of an associated storage.

I hope I will be able to come up with a better way to do the same.

Friday, June 11, 2010

What the Hell are Allocators ???

Writing this is like a compulsive need to understand wht the heck is all about Allocators in c++. I see these whenever I check a process core dump with vectors, maps etc , and they make the complete stack look so scary. And also not to mention the compilation errors. You do some teeny weeny mistake in vector or map or some crazy stl declaration and bammm the compilation error scares the hell out of you, and you wonder where did I declare with those allocators n all.

Here I will try to get over my fears and try to understand properly what are these things. And not embarrass my self because of the fact that being a 3.5 years experienced c++ programmer c++ allocators have eluded me for a long, and will now make an attempt to do it along with explaining to you guys.

The definition as understood by me is that an allocator is a class that handles the allocation/de-allocation of various STL, especially containers. For example take vector, its just a linked list. Whenever we call push_back() on a vector object, it has to allocate some memory for the entry and attache it to the existing linked list. This allocation of memory is hidden to us and is handled by a class specifically designed for the purpose i.e  the allocator.
Now what is the functionality  that we are looking for in the Allocator class
1. Allocation/Deallocation
2. Hopefully Thats it...

Lets take it forward and try to implement an allocator.

#include <iostream>

#include <vector>



using namespace std;



template<typename T>

class Allocator

{

        public:

                typedef T value_type;

                typedef value_type* pointer;





        public:

                inline explicit Allocator()

                {

                        cout << "Constructor of Allocator\n";

                }

                inline ~Allocator()

                {

                        cout << "destructor of Allocator\n";

                }

                inline explicit Allocator(Allocator const&) {}



};



int main()

{

        vector<int, Allocator<int> > vec;

}


Guess what when I compiled I got a really long list of errors. Probably these compilation errors will help us understand Allocators better. So, here is the listing.

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h: In instantiation of `std::vector<int, Allocator<int> >':

test_allocator.cxx:29:   instantiated from here

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:152: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:153: error: no type named `reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:154: error: no type named `const_reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:157: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:158: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:322: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:338: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:354: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:370: error: no type named `const_pointer' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:462: error: no type named `reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:476: error: no type named `const_reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:500: error: no type named `reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:514: error: no type named `const_reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:521: error: no type named `reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:528: error: no type named `const_reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:535: error: no type named `reference' in `class Allocator<int>'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:542: error: no type named `const_reference' in `class Allocator<int>'

test_allocator.cxx: In function `int main()':

test_allocator.cxx:29: error: no matching function for call to `Allocator<int>::Allocator(Allocator<int>)'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:182: error: in passing argument 1 of `std::vector<_Tp, _Alloc>::vector(const typename std::_Vector_base<_Tp, _Alloc>::allocator_type&) [with _Tp = int, _Alloc = Allocator<int>]'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h: In member function `void std::_Vector_base<_Tp, _Alloc>::_M_deallocate(_Tp*, size_t) [with _Tp = int, _Alloc = Allocator<int>]':

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:106:   instantiated from `std::_Vector_base<_Tp, _Alloc>::~_Vector_base() [with _Tp = int, _Alloc = Allocator<int>]'

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:182:   instantiated from `std::vector<_Tp, _Alloc>::vector(const typename std::_Vector_base<_Tp, _Alloc>::allocator_type&) [with _Tp = int, _Alloc = Allocator<int>]'

test_allocator.cxx:29:   instantiated from here

/usr/local/lib/gcc/sparc-sun-solaris2.10/3.4.6/../../../../include/c++/3.4.6/bits/stl_vector.h:117: error: 'struct std::_Vector_base<int, Allocator<int> >::_Vector_impl' has no member named 'deallocate'



So, you can see there are various declarations/definitions missing in the allocator class. I hate to tell this, but its a fact that allocators follow some C++ standards guideline as per how the class should look like. 
(ISO C++ standard, section 20.4.1)
And it looks like the following. I have provided brief description next to each declaration.

//Template definition for allocator as it will be used to allocate 
// any pointer type
    template <class T>  class allocator {
    public:
            //Set of typedefs to be used in various

            //methods for allocation/deallocation/construction

            // destruction etc.


            typedef size_t    size_type;

            //Unsigned integral value that can represent the difference between two pointers
            typedef ptrdiff_t difference_type;

            // Pointer type for the class

            typedef T*        pointer;

            // Const pointer type for the class
            typedef const T*  const_pointer;

            //Reference type for the class 
            typedef T&        reference;

            //Const Reference for the class
            typedef const T&  const_reference;

            // Value type    
            typedef T         value_type;
      
            //template structure, which allows any allocator might allocate memory of another type indirectly.

            template <class U> struct rebind { typedef allocator<U>
                                         other; };

            allocator() throw();
            allocator(const allocator&) throw();
            template <class U> allocator(const allocator<U>&) throw();
            ~allocator() throw();

            pointer address(reference x) const;
            const_pointer address(const_reference x) const;

            // This function will be called, whenever a new memory 
            // allocation is required. As in vector push_back      
            pointer allocate(size_type,
                       allocator<void>::const_pointer hint = 0);

            // This function will be called when the memory location needs to be 

            // freed.

            void deallocate(pointer p, size_type n);
            size_type max_size() const throw();

            // Following are required to initialize/de-initialize the allocated
            // memory location.
            void construct(pointer p, const T& val);
            void destroy(pointer p);
    };


Now after we have understood how our Allocator class should look like, lets put some life into this class. Lets make it usable by defining atleast allocate/deallocate functions. 

Here is how I have defined it in my test program.
            pointer allocator::allocate(size_type sz, std::allocator<void>::const_pointer hint = 0)
            {
                     cout << "Allocating\n";
                     char* ch = (char*)malloc(sizeof(sz));
                     return (pointer)(ch);
            }
            void allocator::deallocate(pointer p, size_type n)
            {
                    cout << "De-Allocating\n";
                    free(p);
            

            }


Now lets see, how does it work in real use.

            int main()
            {
                vector<int> myVector;


                myVector.push_back(1);

  

            }

On compiling and running the program following out put is obtained.
            Allocating

            De-Allocating


At this point atleast we are aware of some internals of allocators, and we know "wht the hell allocators are" . Going further we can define our pool of memory and allocate from that pool and handle all allocations/deallocations in that chunk of memory using our own creative memory management algorithm. 

I hope I will cover that advanced topic some other day , till then cya .....

Thursday, February 4, 2010

Why Leaking Memory ?

This is a common practice programmers have , including myself not to check for any growth in memory of programs written unless someone asks to.

Here is a definition of Memory Leak.

Memory leak is what happens when a program is unable to release memory allocated by it, back to the OS/kernel.

In simple terms I have borrowed something from a friend of mine and I lost it.Now I cannot return it to him. No, I am not Amir Khan from Ghajni, I still remember who that friend of mine is . Just that I am so much careless that I do not take care of stuffs borrowed from friends.

Now this inability of the program to release the memory back to the kernel, in a long run causes the system memory to be full and hence less memory availability for other process to run hence decrease in system performance and consequently bad days for the developer who did not fix the leak at the right time and so on.Hopefully I am still doing fine after ignoring so many memory leaks :-

So, to avoid all that all we have to do is not waste time on facebook/orkut in office time, not spend time at tea or lunch discussing Kat-Salman relationship , not spend time on cracking stupid PJs at your workplace, but the easiest one will be to read further and get enlightened.

Here I have tried to come up with a list of reasons of Memory Leaks and the possible solution for the same apart from not being drunk while writing a piece of code.

A very simple memory leak can be introduced as follows.
int main()
{

char* ch = new char(100);

//Do not free the pointer ch, using
//delete ch;
//and you get a memory leak

}

This looks harmless but if we put the allocation in a while or a for loop to run indefinitely, it make matters worse.

Here are few scenario I could come up with in which we can get memory leak.

1. Assigning a pointer NULL after allocating memory.
{
char *ch = new char(10);
ch = NULL;
}

Solution is to free the memory allocated before assigning NULL to the pointer.

2. Freeing the memory using a pointer of different data type. Sounds rare but its possible.
{
char* ch = new char(100);
int *num = (int*)ch;
delete num; //Just frees sizeof(int) bytes
}

Solution is to delete the pointer of same type which was used while memory allocation.

3. In case of arrays missing [] with delete operator.
{
char* ch = new char[100];
delete ch; //will free just 1byte
}


Solution is to use [] while freeing memory allocated for arrays.
{
char *ch = new char[100];
delete [] ch; // or delete ch[]
}

4. STL containers do not call delete if pointers are stored. For example
a.
{

vector<char*> list;
list.push_back(new car(10));
list.push_back(new car(100));
list.clear();//Clearing pointers, not freeing memory
}
b.
{
map<int, char*> myMap;
myMap[1] = new char(10);
myMap.clear();//Clearing pointers, not freeing memory
}

Solution will be traverse the map or vector or any such container and free the allocated memory individually and then clear the container.

5. Sometimes it happens that a function allocates and returns the pointer to the calling code, but the calling code does not free the pointer.
a.
void fun(char* * ch)
{
*ch = new char(10);
}
b.
char* fun()
{
char *ch = new char(10);
return ch;
}
c.
There are APIs provided by libraries, that provide documentation about the handling of the pointer deletion.

Solution is to free the pointer responsibly and follow the API documentation.

6. Forgetting to delete the pointer member variable in the destructor.
class A
{
char* ch;
public:
A()
{
ch = new char(100);
}
~A()
{
//ch not deleted, memory leak
//delete ch;
}
}

Solution is to free the memory allocated as follows in the destructor
~A()
{
if(ch != NULL)
{
delete ch;
}

}

7. Mistakenly Overwriting the pointer.This can happen in following possible ways.
a.
char *ch = new char(100);
char *ch1 = new char(100);
ch1 = ch;// We lost memory allocated by ch1

b.
vector<char*> list;
list.push_back(new char(10));
list[0] = new char(100);//Overwriting the first pointer

c.
map<int, char*> myMap;
myMap[1] = new char(10);
myMap[1] = new char(100);//Overwriting the pointer with key 1

The solution will be to be careful while assigning a pointer to other. It is advised to make sure that the pointer on left side is NULL.
{
char *ch = new char(100);
char *ch1 = new char(200);
if(ch == NULL)
{
ch = ch1;
}

}

This is applicable to vectors/list/Maps or such STL containers also.

8. In c++ deleting a pointer of base class type when its pointing to a child class object, can cause memory leak if the base class destructor is not declared virtual.

class A
{
public:
A()
{ }

~A()
{
cout << "Destroy A" << endl;
}
};

class B : public A
{
private:
char *ch;
public:
B()
{
ch = new char(100);
}
~B()
{
delete ch;
cout << "Destroy B" << endl;
}
};
int main()
{
A* a = new B();
delete a;
}

The Solution is to declare the ~A() as virtual as follows
virtual ~A(){ }
This makes sure that the destructor of child class is called before the parent class destructor.

So, this is the list that I could come up with. Some of the examples are very trivial, but in a real time code the same scenario occurs in a different way, may be spanning some functions/classes of files. So, its quite difficult to find some of those.

But following of few prgramming practices can minimize the effort to debug the memory leaks.One of them is make sure pointer is NULL before allocating and explicitly assign that pointer to NULL after calling delete. The second one is because delete does not assign the pointer NULL after freeing the allocated memory.
if(pointer == NULL)
{
//Allocate memory using the pointer
}
delete pointer;
pointer = NULL;

Another thing is to keep in mind while deleting a pointer is to check if its not NULL of already freed.This should be done to get rid of any double-delete issues.
if(pointer != NULL)
{
delete pointer;
pointer = NULL;
}

Wishing all of you programmers a best of luck in making your code Mem Leak free. Please provide comments or suggestions if any. Or please add any scenario if I have missed. Hope you will be able to spend time on facebook or orkut as much as possible but make sure your manager is not aware that you are freeeeee.

cya !!!

Wednesday, January 20, 2010

Object Serializing in C++

Why till date we do not have any standard way serialize/deserialize C++ like Java.This is some of the disadvantages C++ have apart from Reflection(as in Java).So I had to find a crude way of doing that which most of you will not like.

Part Zero

This technique is a cliche in C++ object serializing topic.Implement a seialize() function in the class with putting of attributes to the ostringstream.For deserializing read from istringstream and update the variables.

Example:

Part1

memcpy is faster than istringstream read or ostringstream write.
Why dont we have functions written on top of memcpy to provide the interface to serialize/deserialize various variables?
say serializeInt(int i) or serializeBool(bool b) .. etc as follows.
string serializeInt(int i)
{
string p = "1111";
char* ch = (char*)p.c_str();
memcpy(ch, &i, sizeof( i ));
string temp;
temp.assign(ch, sizeof( i ));
return temp;
}

So we use it straight away as

string serialize()
{
_string str;
str.append(serializeInt(10));
str.append(serializeBool(true);
.....and so on
}

Deserialize a bit crooked and looks like this

int deserializeInt(char** ch)
{
int num;
memcpy(&num, *ch, sizeof(num));
*ch += sizeof(num);
return num;
}

So we need to pass the address of string to the deserialize function which will after getting the value modify it to the next item to be deserialized.
So, the deserialize function of the classes look as follows.
void deserialize(string str)
{
char* ch = (char*)str.c_str();
int num = deserializeInt(&ch);
bool f = deserializeBool(&ch);
string str1 = deserializeString(&ch);
}

Part2

Now we need to come up with one class that can handle serializing of any class.Lets name it Serializer.Serializer needs to know the type and value of the attribute that it has to serialize.So, it has a multimap as a member variable. Why Multimap and not Map, because a class can have two integers as attributes or more than one float attribute.Those cannot be entered in map as the key i.e type is not unique. Serializer class also provides a function to insert the type and value to the map.

Now how should the value be entered ??As we are using the multimap we cannot have different types for the value. So, we define the map with a pointer to char as multimap. Following is what the class looks like

class Serializer
{
private:
multimap<string,char> m_typeMap;
public:
void insert(string type, char* ptr)
{
m_typeMap.insert(pair<string,char*>(type, ptr));
}

string serialize()
{
ostringstream str;
char *ptr = str.c_str();
for(multimap<string,char*>::iterator iter = m_typeMap.begin(); iter != m_typeMap.end(); iter++)
{
if(iter->first == "int")
{
int val = *(int*)iter->second;
str.write(reinterpret_cast<char*>(&val),sizeof(val));
}
}
}

void deserialize(string str)
{
//Deserialize and assign the value to the address Here
istringstream str(str);
for(multimap<string,char*>::iterator iter = m_typeMap.begin(); iter != m_typeMap.end(); iter++)
{
if(iter->first == "int")
{
int val;
str.read(reinterpret_cast<char*>(&val), sizeof(val));
*(int*)iter->second = val
}
}
}

};

Now the Serializer class is having code to serialize integer data type only. This can be extended to support any data type.Now lets see how can we use this class.

Following is the class A, which needs to be serialized.We inherit this class from our Serializer class and register the variables that we want to serialize.

class A : public Serializer
{
private:
int i;
public:
A()
{
insert("int", (char*)&i);
}
int getVal()
{
return i;
}
void setVal(int j)
{
i = j;
}
};

Now lets use this class in our main executable.

int main()
{
A obj;
obj.setVal(10);
string str = obj.serialize();

B obj1;
obj1.deserialize();
cout<<obj1.getValue();}

Hope this helps in making serialize/deserialize of c++ objects in a more generic way instead of putting the serialize/deserialize functions in each n every class.

More solutions are welcome.