a new hard-drive in a 2y/o laptop

Tue Jan 4 08:56:51 UTC 2011

On Mon, 3 Jan 2011 16:31:17 -0500, Chris Brennan wrote:
[.. trimming ccs, selectively quoting and de-gmailing a bit ..]

 > On Sun, Jan 2, 2011 at 1:39 AM, Michael Powell <nightrecon at hotmail.com>
 > wrote:
 > > No. I used the of=/dev/ad4 as described above. However, I think you've hit
 > > the nail on the head on one aspect. I believe that 6.2 disk was originally
 > > set up as "dangerously dedicated". It was so long ago and I had forgotten
 > > all about it, but this does dovetail with what you are getting at.
 > 
 > That may be so for you, but nothing less than FreeBSD 8.1 or a Gentoo 
 > LiveCD has touched this drive. Gentoo only to prove to myself that a 
 > successful ext4 filesystem could be created. GPT/GEOM wasn't used, I 
 > used all standard disk-creation methods as described in the gentoo 
 > handbook. I also used Gentoo's gpart utilities to independently verify 
 > that any artifacts of GPT/GEOM were removed (which they were).

Ok.  It may still be useful to see what if anything remains in the areas 
that GPT uses .. see below.

 > > was attempting to do a fresh 'minimal' install of 8.0-Release to the old 6.2
 > > disk pulled off a shelf prior to doing restore(s) of a dump from just the
 > > day before. It was only done because it could be done immediately, and a
 > > newer, larger, better replacement procured after the fact.
 > 
 > This is actually something I fear in reinstalling my other FreeBSD 
 > system, which is currently 7.3, which has been upgraded successfully 
 > from 6.1. But ports has gotten out of hand and I'm rather tired of 
 > trying to fix each port, one at a time when there are prolly hundreds 
 > currently installed.

For sure.  I'm going to wait till 7.4-RELEASE, then reinstall all of the 
ports on my now 7.4-PRERELEASE system from the set of ports (and esp. 
packages) resulting from the ports freeze .. that way there's much less 
chance of inconsistencies and mangled dependencies that often occur when 
trying to upgrade a lot of ports/packages installed some time ago ..

 > > Exact copy of error from my notes here:
 > 
 > > "Unable to find device node for /dev/ad4s1b in /dev! The creation of
 > > filesystems will be aborted." Then pressing "OK" brings this: "Couldn't make
 > > filesystems properly. Aborting."
 > 
 > Yes, this is exactly the same error I get. While that is the same, I 
 > think there is an underlying issue here that is causing my issue that 
 > doesn't exactly pertain to 'dangerously dedicated'.

Sure.

 > > This from sysinstall and occurs after fdisk, labeling, at the point when
 > > sysinstall then tries to write out the config to the disk and newfs.
 > 
 > Yerp, sysinstall pukes at newfs/swap creation, when it can't find 
 > /dev/ad4s1b (which is swap)
 > 
 > > Or any other form of 'garbage'. I'd use the 8.1 LiveFS CD myself just as a
 > > personal preference - but either approach should do the job.
 > 
 > Well, the garbage I reported was because of a typo on my part.
 > 
 > > Yes - I agree. Would also be nice to examine it afterward with a hex editor
 > > to actually see *if* all writes were zero.  Any 'ones' sprinkled in there,
 > > especially in the region of the disk we are talking about would indicate
 > > corruption. And my wild guess if this is the situation it may possibly
 > > indicate some form of subtle hardware incompatibility most likely a clash of
 > > firmwares, e.g. controller and disk(s).  Some form of non-standard
 > > controller implementation, especially wrt to its firmware being buggy.
 > 
 > If someone provides the command for this, I will happily run it and 
 > see if the output is all zeroes...

Ok, let's eg look at the first and last 'tracks' of 63 sectors.  If you 
have somewhere you can copy these to (like a USB stick) then you can do 
that and examine them on another box with hexdump(1) | less(1).  If not, 
as they're expected to be mostly zeroes, you can do it directly:

 dd if=/dev/ad4 count=71 | hd | less

Not sure if Fixit provides hd &/or less, though they appear in /usr/bin?

That shows you the boot sector ie MBR, plus anything in sectors 1-62, 
plus the first 4KB of what will be ad4s1, ie the s1 boot blocks (if 
they've ever been written there and haven't since been cleared).

According to your earlier report, your disk has:
 cylinders=1453521 heads=16 sectors/track=63 (1008 blks/cyl)
and using that (standard) geometry, reformatted a bit:
 Offset       Size(ST)          End     Name  PType    Desc    Subtype Flags
     0             63            62       -    12     unused      0
    63     1465149105    1465149167    ad4s1    8    freebsd    165

Check: 1453521 * 16 * 63 = 1465149168 sectors, numbered 0..1465149167

So since iseek=0 starts at sector 0, then iseek=1465149167 starts at the 
last sector, right?  The last 63 sectors therefore start at 
1465149168 - 63 = 1465149105, so:

 dd if=/dev/ad4 iseek=1465149105 count=63 | hd

shows the last 63 sectors (last track) of the drive.  If this isn't all 
zeroes (which is worth knowing, and recording) then make it so with:

 dd if=/dev/zero of=/dev/ad4 oseek=1465149105 count=63

which is ok for your blank disk, but for a disk in use you should only 
zero the last 33 sectors as (way) below; there may be [meta]data before.
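
The sector arithmetic can be double-checked in sh before pointing dd at 
the disk.  A minimal sketch, assuming the geometry reported earlier and 
512-byte sectors; the variable names are made up for illustration, and 
it only prints the read-only commands rather than running them:

```sh
#!/bin/sh
# Geometry as reported for ad4 (assumed; substitute your own values).
cyls=1453521 heads=16 spt=63

total=$((cyls * heads * spt))      # total sectors, numbered 0..total-1
last_track=$((total - spt))        # first sector of the last 63-sector track
gpt_backup=$((total - 33))         # first sector of the last 33 sectors

echo "total sectors:     $total"
echo "last track starts: $last_track"
echo "last 33 start at:  $gpt_backup"

# Print (do not run) the read-only checks so they can be reviewed first.
echo "dd if=/dev/ad4 iseek=$last_track count=$spt | hd"
echo "dd if=/dev/ad4 iseek=$gpt_backup count=33 | hd"
```

Printing the commands instead of executing them leaves a chance to 
compare the numbers against fdisk output before anything touches the 
drive.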

 > > In the OEM world of the likes of HP, DELL, etc, when this happens a lot of
 > > times they kludge together a work around driver that you can get from their
 > > tech support. It masks the hardware/firmware problem in software, and is
 > > almost always a Windows-centric thing.
 > 
 > *shudder* that's all, just *shudder*

There were also (at least used to be) reports of troubles with some SATA 
cables, and as you've replaced your HD it might be worth checking your 
cable attachments are good, nothing twisted or under sideways pressure?

 > > Bad thing here is the old: "but it worked in 7.x, only fails with 8.x...".
 > > Whenever I see _that_ I think "developer involvement/smarter people than me
 > > required...".

I have exactly that problem resuming from suspend on my Thinkpad T23 on 
all 8.x, where it worked fine from 6.1 through 7.4-PRERELEASE.  So far 
the smarter people are saying nothing; maybe I've offended some gods?

 > Well, the irony here, the failing drive is *ALSO* 8.1, I can slap 
 > that back in and fire it up, it still boots and works, I just didn't 
 > want to take the risk of the drive's cheese sliding off its cracker.

How hard is it to replace the SATA cable in these?  I haven't time to 
hunt now, but recall a swathe of messages to -stable a couple of years 
ago about SATA problems that were entirely solved by replacing cables.

[..]

 > On Sun, Jan 2, 2011 at 2:19 AM, Ian Smith <smithi at nimnet.asn.au> wrote:
 > > On /dev/ad4, oseek=0 zeroes sector 0, the MBR including DOS partition
 > > (FreeBSD slice) table, so that would kill all the slice data, so sure,
 > > ad4s1 won't exist.  oseek=1 just zeroes an unused sector as we've seen.
 > 
 > > What you _can_ do from that state is:
 > 
 > > dd if=/dev/zero of=/dev/ad4 oseek=63 count=8
 > 
 > > which will remove the first 4K of (what will be) slice 1, in case
 > > there's a misconfigured bsdlabel there, for later.  I'm not convinced
 > > this is likely your problem, but it can't hurt before slice 1 exists (by
 > > virtue of having an entry in the MBR, when it should show up in /dev)
 > 
 > I'll give this a shot and let the list know what I find.

Again, getting a copy of what's there before zeroing may be helpful.

 > > Do you mean you dd'd the memstick.img to the external USB drive?  And
 > > that booted ok?  And sysinstall found it ok, as /dev/ad0a?  Details!
 > 
 > Haha! yes, I dd'd the memstick image to the external USB drive. It did boot
 > just fine, but not as /dev/ad0a, it booted the drive as /dev/da0a. Which is
 > a 1gb partition, the other 59gb remained unused/unsliced. I don't have any
 > media where I could write a 1GB image to w/o wasting a DVD and just couldn't
 > justify that loss of space lol.

Sorry, typo: /dev/da0a.  Yes the images are 'hybrid' unsliced disks.  
If you check with fdisk da0 you'll see it appears as slice 4, of about 
24MB.  The boot sector is /boot/boot1 with a munged MBR entry pointing 
to itself (ie slice s4 starts at sector 0), sectors 1-15 are /boot/boot2, 
with an also munged bsdlabel in sector 1.  From an 8.1-R memstick.img:

t23% fdisk -s da0
/dev/da0: 967 cyl 64 hd 32 sec
Part        Start        Size Type Flags
   4:           0       50000 0xa5 0x80
t23% bsdlabel da0c			# (da0a whinges about size error)
# /dev/da0c:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:  1852024       16    unused        0     0
  c:  1852040        0    unused        0     0         # "raw" part, don't edit

You should be able to find 1GB USB sticks for close to free these days; 
longer term sysinstall needs to be taught to boot/use sliced USB media.

 > > Given you've shown previously that s1 starts at sector 63, so will:
 > 
 > > sysctl kern.geom.debugflags=16
 > > dd if=/dev/zero of=/dev/ad4 oseek=63 count=8
 > 
 > Fixit# sysctl kern.geom.debugflags=16
 > sysctl kern.geom.debugflags: 0 -> 16
 > Fixit# dd if=/dev/zero of=/dev/ad4 oseek=63 count=8
 > 8+0 records in
 > 8+0 records out
 > 4096 bytes transferred in 0.431880 secs (9484 bytes/sec)

Ok, so 'dd if=/dev/ad4 iseek=63 count=8 | hd' should confirm it's all 
zeroes (re Mike's concern about confirming that writes are not being 
mangled).  I'm not sure Fixit has hd though, and can't boot one just 
now.  I think there's enough free space on the image to write a few megs 
to /dev/da0, I recall saving a dmesg and sysctl -a there once so I could 
view it on another box, though df already shows it as 'overfull':

/dev/da0a      923679  860995   -11210   101%    /mnt
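
On the all-zeroes check: if hd turns out to be missing from Fixit, 
counting the non-zero bytes is one possible fallback.  A rough sketch, 
assuming tr and wc are available there (not verified); nonzero_bytes is 
a hypothetical helper name, and the demo runs on a scratch file rather 
than a real disk.  It uses the portable skip=/seek= spellings, which 
FreeBSD dd treats the same as iseek=/oseek=:

```sh
#!/bin/sh
# Count the non-zero bytes in a run of 512-byte sectors.
# nonzero_bytes is a hypothetical helper, not part of any FreeBSD tool.
nonzero_bytes() {
    # $1 = file or device, $2 = first sector, $3 = number of sectors
    dd if="$1" bs=512 skip="$2" count="$3" 2>/dev/null |
        LC_ALL=C tr -d '\000' | wc -c
}

# Demonstrate on a scratch file of 71 zeroed sectors.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=512 count=71 2>/dev/null
zero_count=$(( $(nonzero_bytes "$scratch" 63 8) ))
echo "non-zero bytes in sectors 63-70: $zero_count"

# Dirty one byte inside sector 64 and check again.
printf 'X' | dd of="$scratch" bs=1 seek=$((64 * 512)) conv=notrunc 2>/dev/null
dirty_count=$(( $(nonzero_bytes "$scratch" 63 8) ))
echo "after writing one stray byte:    $dirty_count"
rm -f "$scratch"
```

Against the real drive the call would be something like 
'nonzero_bytes /dev/ad4 63 8', where a result of 0 means every byte in 
that range read back as zero.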

 > > Of course that's not impossible, but you did say you'd installed some
 > > linux on it ok?  Clutching at straws, is there anything in your BIOS
 > > regarding different SATA modes you can play with? (No SATA disks here)
 > 
 > Yes, as I said in Mike's reply above, I did write a simple ext4 partition
 > to the drive just to prove to myself that it could be done (and it worked).
 > No, I've checked and rechecked, this laptop's BIOS menu is very limited in
 > detail and changeable options. But nothing about SATA modes.

Ok.

 > > Something else you could try is W)riting the slice table + MBR out from
 > > the fdisk menu, then quit sysinstall and reboot.  You can do the same
 > > after labelling but before newfs'ing .. not generally recommended, but
 > > safe enough on a blank disk.
 > 
 > From the FDISK Partition Editor in sysinstall, I don't see a means to
 > actually write the slice to disk, immediately from that menu. Same for the
 > slice editor.
 > 
 > > If you do the latter, you'll have to reenter your mount points later, so
 > > make a note of the order and size of partitions that you specified.
 > 
 > See above :P

Hmm, certainly still in 7.4-PRE there's a 'W' menu option in both fdisk 
and label screens.  It might still work, but be hidden in 8.2?  [Bruce?]

 > > Hopefully somebody else has a take on all this, I'm out of ideas ..
 > 
 > No worries, I appreciate yours and everyone else's help.

Only helpful when it actually helps :)

 > On Sun, Jan 2, 2011 at 5:19 AM, Bruce Cran <bruce at cran.org.uk> wrote:
 > > This can happen if you've had it partitioned using GPT at some point -
 > > in that case you need to use dd to zero the first _and_ last sectors of
 > > the disk.
 > 
 > So this is two dd operations, one for the first 63 bytes and one for the
 > last 63 bytes? Can you elaborate a little? dd's more advanced operations
 > are still new to me :D

Not bytes but sectors, and the right number seems to be 33.  See below.

 > On Sun, Jan 2, 2011 at 5:22 AM, Bruce Cran <bruce at cran.org.uk> wrote:
 > > See my post later in the thread: this most likely has nothing to do
 > > with the partition layout but the fact that FreeBSD is finding an old
 > > partition scheme.
 > 
 > Later in the thread suggests a post after this one, there is none, or 
 > are you referring to another thread? If so, which one?

:)  The message you quoted immediately before this one, above.

 > On Sun, Jan 2, 2011 at 10:15 AM, Ian Smith <smithi at nimnet.asn.au> wrote:
 > > Hmm, should we bet against a gentoo install using GPT these days?
 > 
 > gpart is part of the gentoo LiveCD, I didn't use it to create any 
 > partitions, just to make sure fbsd deleted anything that might have 
 > been present. I used cfdisk to slice the drive and mkfs.ext4 and 
 > mkswap to create and write the partitions.

Ok, but you still should check the last track of the disk for cruft.

 > > Finding out about the actual disk layout in gpt(8), gpart(8) etc proving
 > > fruitless and finding nothing in Handbook, FAQ or wiki, I resorted to
 > > http://en.wikipedia.org/wiki/GUID_Partition_Table for hopefully correct
 > > information.  I hadn't even known that sectors 1-33 were used for the
 > > GPT (making Mike's zeroing of sector 1 sensible even on sliced disks),
 > > nor that the last 33 sectors were for its backup table, thanks.  So:
 > 
 > >  dd if=/dev/zero of=/dev/da4 seek=N
 > 
 > > where N is the known total number of sectors minus 34, should do it?
 > 
 > I think you mean ad4 and not da4 here .... so that's (ST)-34?
 > 1465149168-34? I'm just trying to make sure I understand what you want me
 > to do here.

Actually, double checking the maths, 1465149168 - 34 = 1465149134 but 
that actually gets you the last 34 sectors (since size-1 gets the last 1 
sector) so I should have said size - 33 = 1465149135.  Add count=33 to 
be sure.  After which, dd it back to check it's all zeroes:

 dd if=/dev/ad4 iseek=1465149135 count=33 | hd

As my brief late followup to that message pointed out, I made a TERRIBLE 
mistake there, saying skip instead of seek.  I see you incorporated that 
correction, phew.  Just to point out how bad a slip it was, what it'd do 
using skip is read - and discard - ~700GB of zeroes from the input, then 
zero the entire disk!  My New Year's resolution is to 'skip using skip' 
and to only ever use the more explicit iseek and oseek from now on!

"Careful with that axe, Eugene!" -- Pink Floyd
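
The skip/seek difference is easy to try out safely on a scratch file.  
A small sketch, assuming a 16-sector file of 0xff bytes as a stand-in 
for a disk full of data; it uses the portable seek= spelling (FreeBSD's 
oseek= is the same thing), and conv=notrunc is only there because the 
target is a regular file rather than a device:

```sh
#!/bin/sh
# Scratch-file demo: seek= (oseek) positions the OUTPUT,
# whereas skip= (iseek) would position the INPUT.
scratch=$(mktemp)

# 16 sectors of 0xff bytes stand in for a disk full of data.
head -c 8192 /dev/zero | LC_ALL=C tr '\000' '\377' > "$scratch"

# Zero only the last 4 sectors: start writing at output sector 12.
dd if=/dev/zero of="$scratch" bs=512 seek=12 count=4 conv=notrunc 2>/dev/null

size=$(( $(wc -c < "$scratch") ))
intact=$(( $(LC_ALL=C tr -d '\000' < "$scratch" | wc -c) ))
echo "file size: $size bytes, untouched data: $intact bytes"
rm -f "$scratch"
```

Swapping seek=12 for skip=12 in that dd line would instead discard 12 
sectors of input zeroes and start writing at output sector 0, which is 
the slip described above in miniature.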

 > > If not, we can't rule out Mike's concerns about BIOS incompatibility
 > > or such, but this sure sounds like the next thing Chris should try.
 > 
 > A BIOS incompatibility has been in the back of my mind. But given 
 > that the laptop is of a recently modern make, switching to a larger 
 > hard-drive shouldn't be this big of an issue.

Indeed it shouldn't.

Good luck, Ian