SoC

Tue May 15 05:17:37 UTC 2007

Duane Whitty wrote:
> On Sunday, 13 May 2007 at 22:39:51 -0700, Garrett Cooper wrote:
>> Duane Whitty wrote:
>>> Garrett,
>>>
>>> Sounds like you're involved in a cool project.  What kind of
>>> community collaboration/involvement would be helpful to you?
>>>
>>> Once, a long, long time ago, I wrote quite a bit of bdb 1.85
>>> code.  At that time it WAS the current version :)  I might
>>> actually remember a bit if I start working with it again.
>>> But what would be most useful to you?
>>>
>>> And if I may ask about a design decision: Why did you choose
>>> a hash structure?  Perhaps if you have time you could give
>>> a little more info but whatever fits your schedule.
>>>
>>> Good luck on your project.
>>>
>>> Duane
>> Duane,
>>
>> 	I actually chose a hash structure at the time because I thought it was 
>> appropriate for the size of the ports tree and the number of files that 
>> may need to be used. Plus, Kris suggested that :). From what I've seen 
>> of how things are used, this would be great for searching for who 
>> added what file, finding cyclic dependencies easily, maintaining 
>> uniqueness, etc., all common issues with the current ruby scripts.
>>
>> 	Also, the other available BDB options like btrees seem inefficient, 
>> over the long run :(..
>>
> 
> I guess frequent deletions and lack of space recovery are the problem with btrees?

Yes, that's part of it, but having to rebalance the damn tree (splitting 
and merging pages) every once in a while to get good performance is a 
waste of resources too. Hash tables are of course much better at 
insertion / deletion, as you probably well know. Having to do O(log n) 
insertion and deletion with btrees isn't good at all :(.. On hash tables 
it's O(1) something like 99.9% of the time from what I know, and O(n) 
only in the rare worst case (depends on the bucket and generation 
schemes of course). All that has to happen every once in a while is that 
the bucket capacity may increase, or the number of overall buckets may 
grow to decrease collisions, but that seems much more feasible than a 
lopsided tree :).
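
To make the bucket-growth point concrete, here is a minimal sketch of a 
chained hash table in C. It is an illustration only, not the hash code 
in /usr/src/lib/libc/db: every name in it (struct table, table_insert, 
table_grow, and so on) is made up for the example, and the 
load-factor-of-one growth rule is an assumption. Insert and delete touch 
a single bucket; the only O(n) step is the occasional table_grow() call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the libc/db hash code. */
struct entry {
    char *key;
    struct entry *next;      /* chain of entries sharing one bucket */
};

struct table {
    struct entry **buckets;
    size_t nbuckets;
    size_t nentries;
};

/* 32-bit FNV-1a string hash. */
uint32_t hash_key(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

struct table *table_new(size_t nbuckets)
{
    struct table *t = malloc(sizeof(*t));
    assert(t != NULL && nbuckets > 0);
    t->buckets = calloc(nbuckets, sizeof(*t->buckets));
    assert(t->buckets != NULL);
    t->nbuckets = nbuckets;
    t->nentries = 0;
    return t;
}

/* The occasional O(n) step: double the bucket count, relink every entry. */
void table_grow(struct table *t)
{
    size_t n = t->nbuckets * 2;
    struct entry **nb = calloc(n, sizeof(*nb));
    assert(nb != NULL);
    for (size_t i = 0; i < t->nbuckets; i++) {
        struct entry *e = t->buckets[i];
        while (e != NULL) {
            struct entry *next = e->next;
            size_t j = hash_key(e->key) % n;
            e->next = nb[j];
            nb[j] = e;
            e = next;
        }
    }
    free(t->buckets);
    t->buckets = nb;
    t->nbuckets = n;
}

int table_contains(const struct table *t, const char *key)
{
    const struct entry *e = t->buckets[hash_key(key) % t->nbuckets];
    for (; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return 1;
    return 0;
}

/* Returns 1 if inserted, 0 if the key was already there (uniqueness). */
int table_insert(struct table *t, const char *key)
{
    if (table_contains(t, key))
        return 0;
    if (t->nentries >= t->nbuckets)   /* assumed rule: grow at load factor 1 */
        table_grow(t);
    size_t i = hash_key(key) % t->nbuckets;
    struct entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = malloc(strlen(key) + 1);
    assert(e->key != NULL);
    strcpy(e->key, key);
    e->next = t->buckets[i];          /* push onto the front of the chain */
    t->buckets[i] = e;
    t->nentries++;
    return 1;
}

/* Returns 1 if removed, 0 if not found; only one chain is walked. */
int table_remove(struct table *t, const char *key)
{
    struct entry **pp = &t->buckets[hash_key(key) % t->nbuckets];
    for (; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->key, key) == 0) {
            struct entry *dead = *pp;
            *pp = dead->next;
            free(dead->key);
            free(dead);
            t->nentries--;
            return 1;
        }
    }
    return 0;
}
```

Starting from table_new(2) and inserting a handful of keys triggers one 
table_grow(); inserting a key that is already present returns 0, which 
is the uniqueness check I mentioned earlier.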

>> 	Do you know of any simple APIs that can quickly dump the fields in use 
>> with BDB .db files? I have a hunch, given the Ruby that I've looked at 
>> in Portupgrade, that something very inefficient's in play, but I want 
>> to test my assumption first before jumping to conclusions.
>>
> 
> I did a quick ports search and came up with databases/ruby-bdb1.  I don't grok ruby
> but I've been telling myself I should learn [sigh].  I don't know if this has a simple
> API or not; I'll take a look but I suspect it is probably overkill.

Ruby's nice, but it borrows heavily from Perl, so I have suspicions on its overall 
usability / speed given my experience with Perl over the past 4 months 
daily for work :(.. Ruby's just the new big thing for programming 
languages, so everyone's into it. Kind of like how Java was compared to 
C/C++ a few years back. But once everything dies down people will 
realize that they'll still have to program in C/C++/Perl for real-world 
applications.

Python seems better than Ruby from what I can see, but I really don't 
like the mandatory indentation thing. Ew..

> If this doesn't meet your project's needs I'll try coding something up in C.  I
> imagine we'll need some tools written in C at some point anyhow.

That's ok. If you don't know anything right offhand that's more than a 
few lines, I'll keep on hunting / prodding for documentation / 
resources, and/or keep on reading the /usr/src/lib/libc/db source. I was 
just looking for something that could help me get moving quickly in the 
right direction.

>> 	Thank you very much for the help :)!
>>
> 
> Well, we'll see about how much help I can be; but I'll try.  It's your project
> so let me know what you need or don't need/want

Thank you again though :). Having seasoned vets help will certainly 
bring about good ideas and ideologies that I may not have thought of 
myself, and will no doubt prove to foster better production code than I 
could do by myself.

Cheers,
-Garrett