c question

Philip Herron herron.philip at googlemail.com
Fri Apr 23 18:30:45 UTC 2010



Joerg Sonnenberger wrote:
> On Fri, Apr 23, 2010 at 06:18:46PM +0300, Eitan Adler wrote:
>>> - use a matrix is faster than use a linked list?
>> For what?
>> For insertion and deletion no - linked list is faster. For sequential
>> access they are the same speed (forgetting look-ahead caching). For
>> random access matrix is faster.
>
> Actually -- it depends. Removing the tail and inserting at tail is
> amortised constant time for arrays if done using the double-on-full
> trick. In that case, array can be the faster datastructure too.
>
Hey

I just follow this list and felt I should chuck in my $0.02.
Choosing a data structure is really specific to what you're going
to do with it. So first off, think about how you want to handle
your data; don't just choose a structure because you know it.
Think about the ideal way to access or handle your data and build
your data structure from there. Then you need to think about how
it's implemented, so for example take a look at the linked list:

struct list_node {
    void *data;              /* payload */
    struct list_node *next;  /* next node, or NULL at the tail */
};

With this struct you can easily maintain your own pointers to the
head and the tail. Adding nodes is then a trivial operation: you
have a pointer to the tail, so you just create a new node and link
it in. But searching for a particular index in the list means
iterating from the head, and that does not scale. With only 10
nodes that's fine, but with 1000 you're going to have to wait for
it. That's not to say this is a bad structure: if you know you'll
only ever have a few nodes, why not simply use it?

A more typical C-style approach would be:

struct list {
    void **array;               /* array of pointers to data */
    unsigned int size, length;  /* allocated slots, elements in use */
};

So it becomes an array-based implementation: in exchange for having
to grow the array of data pointers once it fills past a threshold,
you get better overall performance, since access is simply array
indexing. Here length is the number of elements stored and size is
the allocated capacity, so you know when you're about to overflow
and need to realloc. You could be clever and use a threshold
allocation, where you expand two-fold whenever you get close to
full; that's quite naive, but you get the idea.

So that's the same data structure done two different ways, but it
doesn't stop there: you could build yourself dictionaries or hash
tables. I like hash tables a lot because, although they can be
memory hogs depending on how they're implemented, they scale very
well for searching your data and adding more. You just need to
spend time finding a useful hashing function.

--Phil



More information about the freebsd-hackers mailing list