SoC
Garrett Cooper
youshi10 at u.washington.edu
Tue May 15 05:17:37 UTC 2007
Duane Whitty wrote:
> On Sunday, 13 May 2007 at 22:39:51 -0700, Garrett Cooper wrote:
>> Duane Whitty wrote:
>>> Garrett,
>>>
>>> Sounds like you're involved in a cool project. What kind of
>>> community collaboration/involvement would be helpful to you?
>>>
>>> Once, a long, long time ago, I wrote quite a bit of bdb 1.85
>>> code. At that time it WAS the current version :) I might
>>> actually remember a bit if I start working with it again.
>>> But what would be most useful to you?
>>>
>>> And if I may ask about a design decision: Why did you choose
>>> a hash structure? Perhaps if you have time you could give
>>> a little more info but whatever fits your schedule.
>>>
>>> Good luck on your project.
>>>
>>> Duane
>> Duane,
>>
>> I actually chose hash structure at the time because I thought it was
>> appropriate for the size of the ports tree and the number of files that
>> may need to be used. Plus, Kris suggested that :). Given the way that
>> I've seen how things are used, this would be great for searching for who
>> added what file, finding cyclic dependencies easily, maintaining
>> uniqueness, etc, many common issues with the current ruby scripts.
>>
>> Also, the other available BDB options like btrees seem inefficient,
>> over the long run :(..
>>
>
> I guess frequent deletions and lack of space recovery are the problem with btrees?
Yes, that's part of it, but having to pivot the damn tree every once in
a while to get good performance is a waste of resources too. Hash tables
are of course much better at insertion / deletion as you probably well
know. Having to do (possibly) O(n) insertion and deletion with btrees
isn't good at all :(.. On hash tables its O(c) to something less than
99.9% of the time from what I know O(n) (depends on the bucket and
generation schemes of course). All that has to happen every once in a
while is the buckets capacity may increase, or the number of overall
buckets to decrease collisions, but it seems much more feasible than a
lopsided tree :).
>> Do you know of any simple APIs that can quickly dump fields in use
>> with BDB .db files? I have a hunch given the Ruby that I've taken a look at
>> with Portupgrade that something very inefficient's in play, but I want
>> to test my assumption first before jumping to conclusions.
>>
>
> I did a quick ports search and came up with databases/ruby-bdb1. I don't grok ruby
> but I've telling myself I should learn [sigh]. I don't know if this has a simple
> API or not; I'll take a look but I suspect it is probably overkill.
Ruby's nice, but it's built on Perl so I have suspicions on its overall
usability / speed given my experience with Perl over the past 4 months
daily for work :(.. Ruby's just the new big thing for programming
languages, so everyone's into it. Kind of like how Java was compared to
C/C++ a few years back. But once everything dies down people will
realize that they'll still have to program in C/C++/Perl for real-world
applications.
Python seems better than Ruby from what I can see, but I really don't
like the mandatory indentation thing. Ew..
> If this doesn't meet your project's needs I'll try coding something up in C. I
> imagine we'll need some tools written in C at some point anyhow.
That's ok. If you don't know anything right offhand that's more than a
few lines, I'll keep on hunting / prodding for documentation /
resources, and/or keep on reading the /usr/src/lib/libc/db source. I was
just looking for something that could help me get moving quickly in the
right direction.
>> Thank you very much for the help :)!
>>
>
> Well, we'll see about how much help I can be; but I'll try. It's your project
> so let me know what you need or don't need/want
Thank you again though :). Having seasoned vets help will certainly
bring about good ideas and ideologies that I may not have thought of
myself, and will no doubt prove to foster better production code than I
could do by myself.
Cheers,
-Garrett
More information about the freebsd-hackers
mailing list