Feedback for performance tracker
Kris Kennaway
kris at obsecurity.org
Wed Aug 15 12:18:52 PDT 2007
On Wed, Aug 15, 2007 at 11:04:29AM +0200, Erik Cederstrand wrote:
> Hi!
>
> This autumn, we have decided to grab the Performance Tracker entry[1]
> from the project ideas page and give it a spin as a subject for our
> thesis at the IT University of Copenhagen. The tracker intends to fill a
> hole in the range of tinderboxes and automatic stress/regression tests
> that FreeBSD already has.
>
> The initial idea is to have a small collection of servers constantly
> performing benchmarks and publishing the results to a server with a web
> interface.
>
> Before we start coding, we'd like to ask a couple of questions:
>
> 1) Which benchmarks would you like to see being run?
> 2) Which tests do you perform regularly, which the tracker could automate?
> 3) Which features in the web interface would you find most helpful?
>
> Also, we'd greatly appreciate pointers to previous work in the area.
>
> We welcome all comments and suggestions, but please bear in mind that we
> only have around 3 months full-time to develop the tracker.
Hi,
Thanks for your interest in the project. I have some recommendations
for how to approach it:
* Don't focus on the individual benchmarks; focus instead on the
framework for accumulating and analysing the data. There are lots of
benchmarks we may want to plug into this over time, so developing a
flexible and extensible system for doing this is more important than
any given benchmark.
* I imagine a system where data from benchmark systems (which will be
geographically remote) is fed into a database that tracks multiple
data sets over time. A front end would provide an interface into
this database and allow for various analyses and visualizations of
the data.
* The system should allow for annotation of data, for example to
provide explanations for sudden jumps in performance when they are
understood.
* Data sets may be multi-dimensional (e.g. tracking a performance
metric like network throughput as various parameters like packet
size, number of concurrent streams, etc, are changed). In most
cases we are also interested in changes over time.
* There may be parametric and non-parametric variables. An example of
a parametric variable would be "size of a network packet" (i.e. a
numerical parameter which takes values over some range). A
non-parametric variable might be "kernel built with option X, or
option Y, or option Z". It makes sense to visualize parametric
data as a continuous function, e.g. by plotting it as a curve on a
graph, or fitting the data to a function. It makes less sense to
treat non-parametric data as a continuous function.
* Data sets are typically noisy. They need to be analysed by
statistical techniques to extract a signal (if any), which will
usually be tiny over short periods but may accumulate over longer
ones. A background in statistics will be most useful here.
* An ideal front-end would be able to apply appropriate statistical
and data visualization techniques to cross-sections of the data to
answer questions like "have there been any statistically significant
changes to this data set (or subset) over time, and if so, when did
they occur?".
* There is likely to be significant prior art in all of this, but I
don't know what any of it is. The HDF data format
http://hdf.ncsa.uiuc.edu/ and related tools might be interesting to
investigate; but I don't really know anything about it so it might
be too heavy-weight. Perhaps some of our scientific computing users
can make some suggestions.
* Start small. You should keep an eye on the bigger picture such as
what I suggest, but don't try and bite it all off at once. For
example, you could start by limiting yourself to recording and analysing
sets that contain only a single data point changing over time (while
hopefully not limiting future expansion), because even that will be
a useful beginning.
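
To make the "start small" case concrete, here is a minimal sketch (an
illustration only, not part of any existing FreeBSD tool) of a single
metric recorded over time, with optional parameters and annotations, and
a crude answer to "has this data set changed, and if so when?". All
names (Sample, cross_section, most_likely_change) and the threshold of
4.0 are assumptions made up for this example. The check scans every
split point and keeps the one with the largest Welch t statistic; a real
tracker would want a proper change-point method that corrects for
testing many splits.

```python
from dataclasses import dataclass, field
from math import copysign, inf, sqrt
from statistics import mean, variance


@dataclass
class Sample:
    day: int                                    # hypothetical time index of the run
    value: float                                # measured metric, e.g. throughput
    params: dict = field(default_factory=dict)  # e.g. {"packet_size": 1500}
    note: str = ""                              # optional human annotation


def cross_section(samples, **fixed):
    """Keep only the samples whose params match every given key/value."""
    return [s for s in samples
            if all(s.params.get(k) == v for k, v in fixed.items())]


def welch_t(a, b):
    """Welch's t statistic for mean(b) - mean(a), unequal variances allowed."""
    diff = mean(b) - mean(a)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def most_likely_change(samples, min_seg=5, threshold=4.0):
    """Return (day, t) for the split with the largest |t|, or None.

    threshold is an arbitrary, deliberately conservative cut-off
    (an assumption of this sketch), not a calibrated significance level.
    """
    ordered = sorted(samples, key=lambda s: s.day)
    values = [s.value for s in ordered]
    best = None
    for i in range(min_seg, len(values) - min_seg + 1):
        t = welch_t(values[:i], values[i:])
        if best is None or abs(t) > abs(best[1]):
            best = (ordered[i].day, t)
    if best is not None and abs(best[1]) >= threshold:
        return best
    return None


# Made-up data: a noisy metric that drops from about 100 to about 97 on day 10.
noise = [0.5, -0.3, 0.2, -0.4, 0.1, -0.1, 0.3, -0.2, 0.4, -0.5]
samples = [Sample(day=d, value=(100.0 if d < 10 else 97.0) + noise[d % 10],
                  params={"packet_size": 1500})
           for d in range(20)]
flat = [Sample(day=d, value=100.0 + noise[d % 10]) for d in range(20)]

change = most_likely_change(cross_section(samples, packet_size=1500))
if change is not None:
    day, t = change
    # Annotate the first sample after the detected jump.
    next(s for s in samples if s.day == day).note = "unexplained drop"
    print(f"possible change at day {day} (t = {t:.1f})")
```

The params dictionary is where extra dimensions (packet size, number of
streams, kernel option) would go, and cross_section() picks out one
slice of them before the series is analysed.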
Kris