Powerd and est / eist functionality

Jeremy Chadwick freebsd at jdc.parodius.com
Wed Mar 24 21:36:56 UTC 2010


On Wed, Mar 24, 2010 at 01:37:06PM -0700, John Long wrote:
> I am trying to ascertain the viability of this motherboard w/
> regards to getting the power function working proper and am
> constrained by the lack of monitoring tools vs what cupid.com has
> for win with hwmonitor and cpuz (they have a dev kit also). Would
> another brand/model of mb work better? I know that most all are
> lacking in acpi function in diff ways. Maybe I am squeezing water
> out of a rock, that the cpu is at its min or 6 watts now, but I just
> do not know.

You're placing too many eggs in one basket.  Hardware monitoring is a
separate beast, and one you won't find good support for on FreeBSD.
By "hardware monitoring" I'm talking about thermals off the mainboard,
fan RPMs, CPU temperature (not on-die temps), and voltages.  Let's talk
about those first, as they're something I'm familiar with.

- These data sources are only available if the motherboard manufacturer
added a H/W monitoring IC to their mainboard.  Consumer mainboards
are spotty as far as offering this capability.

- Each mainboard is different.  Each mainboard model, or even
subrevision, can use a completely different H/W monitoring chip.  To
make matters more complex, there are multiple models of H/W monitoring
ICs, and even multiple revisions/versions of the same model that behave
completely differently from their predecessor(s).

- How this chip is implemented on the mainboard is up to the
manufacturer.  Some chips only exist on the LPC bus (think ISA I/O
ports).  Some chips support SMBus.  The mainboard has to be engineered
so that the pins on the H/W IC tie in to the LPC bus or SMBus.  You
don't know which is available/used unless the manufacturer states such
in their User Manual.

- In the case of LPC, the exact I/O ports must be provided by the board
manufacturer.  In the case of SMBus, the slave address must be
provided by the board manufacturer.  **YOU CANNOT GUESS THESE** despite
Linux's lm-sensors project having folks think otherwise.  Guessing
("probing") is very high risk, and involves submitting reads to a select
set of LPC I/O ports (which could be used for other things/devices), or
to a select SMBus slave address.  Some registers do things to the system
(or associated chips) on read operations; bits can get reset.  I've seen
this happen in the case of one Winbond chip where an incorrect CRxx read
resulted in the chip's watchdog firing an NMI.

- Knowing the exact model and subrevision of H/W IC is important, since
the programmer of the monitoring software has to know what all the
register offsets are, how to decode the data, etc.  There is no
standard, and there are multiple manufacturers of such devices
(Nuvoton/Winbond, National Semiconductor, ON Semiconductor/Analog
Devices, SMSC, TI, and some others.  For a while companies like AMD,
nVidia, Intel, Broadcom, ALi, and VIA were making their own chips as
well).

- In the case of SMBus, the operating system must have an SMBus driver
for the system chipset (not H/W chip).  For example, Intel systems often
use ichsmb(4), AMD systems use amdsmb(4), and so on.  Without a driver
it's not feasible/possible for userland applications to talk to a device
connected to SMBus.  For the sake of example, there's no SMBus driver on
FreeBSD for ServerWorks chipsets.

Getting all of this data out of the mainboard manufacturer is like
pulling teeth, especially in the case of consumer boards.  Server board
manufacturers (Supermicro, Tyan, Intel, etc.) often disclose this
information to those who request it via Technical Support.  But if you
were to mail, say, Asus for this information, it'd likely go in one ear
and out the other.

All that (negativity) said: the closest thing you'll find on FreeBSD to
interface with these chips is ports/sysutils/mbmon,
ports/sysutils/healthd, or ports/sysutils/bsdhwmon.

mbmon supports very old mainboards which utilise LPC (its SMBus support
is broken/shoddy).  It also tries auto-probing, often gets it wrong, and
spits out readings which are incorrect/off the charts.  Sometimes it
gets things wrong and spits out values that look real but aren't.

healthd is basically mbmon with some minor changes to the core but major
changes to the surrounding user interface.

bsdhwmon, of which I'm the author, is intended for servers only and
only speaks to devices using SMBus.

Are we having fun yet?

Now back to the bigger picture...

CPU temperature (assuming you have a newer AMD or Intel CPU) is
available via the coretemp(4) driver on Intel or the amdtemp(4) driver
on AMD, and active CPU clock frequency is available via the cpufreq(4)
drivers and their subsets.  There is also
some very archaic (IMHO) support for temperature monitoring via ACPI,
but I believe that's mainly intended for laptops.  With regards to ACPI,
you're purely limited by what the mainboard/BIOS implementer does; many
consumer motherboard vendors have absolutely no idea what to do with a
technical support request asking them to fix/improve their ACPI tables.

For coretemp(4), you'll find the thermals under dev.cpu.X.temperature
in sysctl.
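
A minimal sketch of pulling that value into a script, assuming the
sysctl prints the temperature as a number followed by "C" (for example
"45.0C").  The function names are made up for illustration, and the
second function only works on a FreeBSD box with the driver loaded:

```python
import subprocess


def parse_temperature(text):
    """Convert a sysctl temperature string such as '45.0C' to a float (Celsius)."""
    value = text.strip()
    if not value.endswith("C"):
        raise ValueError("unexpected temperature format: %r" % text)
    return float(value[:-1])


def read_cpu_temperature(cpu=0):
    """Query dev.cpu.<cpu>.temperature through sysctl(8)."""
    out = subprocess.check_output(
        ["sysctl", "-n", "dev.cpu.%d.temperature" % cpu]
    )
    return parse_temperature(out.decode("ascii"))
```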

For cpufreq(4), you'll find the available processor frequency levels
under dev.cpu.X.freq_levels and what the current frequency is in
dev.cpu.X.freq.
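
For what it's worth, freq_levels comes back as a space-separated list of
"MHz/mW" pairs, with -1 as the power figure when the driver doesn't know
it.  A rough sketch of decoding that, assuming this format; the sample
string is invented and not taken from any real board:

```python
def parse_freq_levels(text):
    """Parse 'MHz/mW MHz/mW ...' into a list of (mhz, milliwatts) tuples.

    A power value of -1 means the driver does not know the power draw.
    """
    levels = []
    for entry in text.split():
        mhz, _, mw = entry.partition("/")
        levels.append((int(mhz), int(mw)))
    return levels


# Made-up sample, shaped like the output of: sysctl -n dev.cpu.0.freq_levels
sample = "2400/88000 2000/65000 1600/45000 800/-1"
levels = parse_freq_levels(sample)

# The slowest available level is the one with the smallest MHz figure.
lowest = min(levels, key=lambda lv: lv[0])
```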

For est(4), there's dev.est.X.freq_settings but I'm not sure how to get
these to be used or how to tune them; keep reading.

> Btw: Intel is blowing out all 775 type chips. Today is about the last
> day. They want everyone on I3/I5 etc but they are not as functional re
> low tdp as 775 chips.

That's a very interesting opinion you have there.  I continue to see
LGA775 chips sold regularly all over, and new stock coming in fairly
often to major resellers online.

The i3/i5/i7 chips don't appear to offer ECC framework on their memory
controllers (which are now on-die as I'm sure you know), which is why I
plan to stay away from them for servers.  Intel's pushing Xeon for that,
which I'm not willing to switch to until the prices drop more.  There
are existing C2D and C2Q CPUs which have the exact same capabilities and
features as their Xeon counterparts yet the Xeons cost $50-100 more.
It's like SCSI all over again.

If low-as-possible TDP is all you're concerned with, buy an Atom.

> I can find very little comprehensive info on how the
> eist/est//p4tcc/powerd thing is supposed to work.  Reading source of
> powerd is not helping. Logically, if the voltage is lowered then the
> power is going to be lower. Is the voltage a function of the load
> automatically controlled by hardware and/or the bios or is it
> supposed to be an artifact of the freq being lowered by something
> like powerd? I believe the former for the latter is not working. I
> now have everything relevant in the bios enabled.

What you probably want to look at is the source to all the subset
cpufreq(4) drivers that powerd(8) speaks to.  See the cpufreq(4) man
page for details.

It seems most of us here -- myself included -- have very little
familiarity with how to get FreeBSD to use one subset driver or another
(e.g. est(4) vs. acpi_throttle(4), etc.).  Someone recently clued me in
to how to switch from acpi_throttle to est on my Intel board which
*does* attach est(4) successfully, but I haven't bothered trying it
yet:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055665.html
http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055666.html

The disabled=1 variables shown there are for loader.conf, by the way,
and require a reboot after adjusting.

I'd give you links to the main thread, but Dan Naumov's mail client
doesn't appear to set the In-Reply-To/References headers, so every reply
of his appears as a "new" entry in the thread list.  Search for "powerd
on 8.0, is it considered safe?" here:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/thread.html

> est appears to be not working but what would happen if it were
> working?

It should lower the clock frequency of the CPU during idle times, and
increase it during load.  **HOW** it goes about adjusting the frequency
is where the different subset drivers come into play.
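
As a toy illustration of that policy only: this is NOT powerd(8)'s
actual algorithm, and the thresholds and function name are invented.
A governor could pick the next frequency level from the measured load
roughly like this:

```python
def next_level(levels, current, load, up=0.75, down=0.50):
    """Return the index of the frequency level to switch to.

    levels  -- frequencies in MHz, sorted from highest to lowest
    current -- index of the level in use now
    load    -- CPU utilisation over the last interval, 0.0 to 1.0
    up/down -- hypothetical thresholds, chosen for illustration only
    """
    if load > up and current > 0:
        return current - 1          # busy: step to the next faster level
    if load < down and current < len(levels) - 1:
        return current + 1          # idle: step to the next slower level
    return current                  # in between, or already at the limit


# Made-up list of levels, fastest first.
levels = [2400, 2000, 1600, 800]
```

Applying the chosen level would then mean writing it to dev.cpu.0.freq;
which subset driver actually carries out the change is the part
cpufreq(4) decides.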

The attachment "error 6" message you see usually indicates the est(4)
driver doesn't have support for your specific model of CPU based on its
capabilities, or at least that's how I understand it.  John Baldwin (I
think?) or Nate Lawson might have to chime in here.

You're not the first one to report this issue.  It comes up fairly
often.  Possibly since you're digging into the code you'd like to take
up maintaining these pieces?

> I csupd to stable and rebuilt and there is no difference w/ this
> prob. I did see that I went from SATA150 (it should be SATA300) to
> udma100 sata but that is for another thread.

This is a bug/quirk of some changes in ata(4).  Your drive should be
operating at full SATA speed (probably SATA300).  You can bring this up
in another thread if you want, but it's purely cosmetic as far as I
know.  atacontrol(8) and diskinfo(8) -t and -c will come in handy.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



More information about the freebsd-stable mailing list