BSDStats - What is involved ... ?
Brooks Davis
brooks at one-eyed-alien.net
Tue Aug 29 01:53:34 UTC 2006
On Tue, Aug 29, 2006 at 11:39:23AM +1000, Antony Mawer wrote:
> On 29/08/2006 6:07 AM, Marc G. Fournier wrote:
> >On Mon, 28 Aug 2006, Brooks Davis wrote:
> >
> >>While I understand (or think I understand) the motivations for this
> >>design goal, it's contrary to allowing collection of statistics from
> >>many people. I'd love to be able to publish data from the FreeBSD
> >>systems (300+) at work, but unless I can do it in an anonymized
> >>aggregate form it's not going to happen. I just can't justify leaking
> >>that much internal configuration information given a policy of hiding
> >>it (right or wrong and not subject to debate). If I could run my own
> >>stats server and publish from it that might be possible.
> >
> >Aggregate submissions will never be possible, as it will definitely
> >break any attempts at keeping the data 'clean' :( I do understand that
> >we will never be able to get *everyone* reporting, but we will try as
> >much as possible to make it easy for as many as possible to report
> >*within* limits ...
> >
> >I'm going to work on an 'email submission' method in September, that
> >would allow reporting to go *thru* one mailbox, and will include a
> >confirmation/challenge stage *per* server though ...
>
> Brooks, what sort of information are you looking to "anonymise" before
> sending it out? Aggregating to say that I have X of this kind of CPU, Y
> of this IDE chipset, etc, rather than linking it specifically to each
> machine? Where would you feel a comfortable balance lies? Obviously some
> effort needs to be made to minimise fraudulent entries.
>
> Perhaps aggregate submissions could be conducted using a registration
> mechanism...
>
> Other thoughts would be having a local stats aggregation server that
> pushes summaries up to the master server... the aggregation server keeps
> the individual details, and some sort of challenge mechanism could be
> randomly selected by the master server to reduce the ease with which the
> numbers can be 'faked'?
>
> ... just rambling as I thought of potential ways around this ...
I'd prefer not to expose host names or IP addresses. Hardware
information and OS version aren't really a problem if they can't be
traced to a host name. The requirement to register an aggregation
server would be fine with me. A challenge mechanism would be tricky
because it would have to occur during a push to the central server,
since connections back are not really possible.
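
To make the idea concrete, here is a rough sketch of what a local
aggregation server might do before pushing a summary upstream. The
field names (hostname, ip, cpu, os) and the summary layout are
assumptions for illustration only, not the actual BSDStats report
format.

```python
from collections import Counter, defaultdict

# Hypothetical field names; the real BSDStats report format may differ.
IDENTIFYING_FIELDS = {"hostname", "ip"}

def aggregate(reports):
    """Collapse per-host reports into anonymous per-field counts."""
    counts = defaultdict(Counter)
    for report in reports:
        for field, value in report.items():
            if field in IDENTIFYING_FIELDS:
                continue  # stays on the local aggregation server
            counts[field][value] += 1
    return {
        "hosts": len(reports),
        "counts": {field: dict(c) for field, c in counts.items()},
    }

# Made-up sample data using documentation-reserved names and addresses.
reports = [
    {"hostname": "a.example.org", "ip": "192.0.2.1",
     "cpu": "Opteron 250", "os": "6.1-RELEASE"},
    {"hostname": "b.example.org", "ip": "192.0.2.2",
     "cpu": "Opteron 250", "os": "6.1-RELEASE"},
    {"hostname": "c.example.org", "ip": "192.0.2.3",
     "cpu": "Pentium 4", "os": "5.5-RELEASE"},
]

summary = aggregate(reports)
print(summary)
```

Only the summary would be sent during the push; the per-host records,
including host names and IP addresses, would remain on the aggregation
server.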
-- Brooks