Prompted by a message on the NetBSD port-sparc64 list, I started up my Ultra 45 to see whether it would be easy to add the missing hardware support. From a quick look, we had no support for the clock, the I²C controllers, or some of the I²C devices.
The first device to tackle was the real-time clock. The firmware rtc node has a compatible property of bq4802. The documentation for the bq4802Y/bq4802LY is available online and the chip is similar to other RTC chips. It was easy to write a driver for the chip and add it to the sparc64 configuration. My U45 now keeps time:
[ 1.000000] bq4802rtc0 at ebus0 addr 100000-10000f: real time clock
Next, I looked at the I²C buses. The U45 has an i2c node with compatible properties SUNW,i2c-pic16f747 and pcf8584. I guessed that this was a pcf8584 implemented in a pic16f747 chip, and I noticed a comment online that this implementation needs a delay after reads and writes. I extended the existing pcfiic ebus matching to add an extra flag to handle this, even though I'm not 100% sure that we need it.
Whilst looking at the driver and also starting to read documentation for the devices that were on the I²C bus, I noticed that we didn't implement the protocol correctly in our pcfiic driver. When we want to read from a register, we should send a write to the chip address, then a repeated start, and finally a read to the same address to retrieve the value. However, the pcfiic code was missing the repeated start step. On machines with another bus master, that master can acquire the bus between our write and our read. If the other master happened to read a register on the same chip, then we would read the register that the other master had read, instead of the one we expected! This was the cause of the problems that I'd seen when working on the V240 environmental monitoring. After only 10 years, I had found the real problem and could remove the workaround in the adm1026 driver!
On the I²C bus connected to the pcfiic, the U45 firmware has:
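To make the race concrete, here is a toy model of a chip with an internal register pointer and two ways of reading from it. This is only a sketch to illustrate the idea, not the pcfiic code: the toy_chip structure, the function names and the other_reg parameter (which stands in for a second bus master) are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy I2C slave: a write selects a register, a read returns it. */
struct toy_chip {
	uint8_t regs[256];
	uint8_t pointer;	/* set by a write, used by the next read */
};

/* Write phase: the byte after the chip address selects a register. */
void
toy_write_pointer(struct toy_chip *c, uint8_t reg)
{
	c->pointer = reg;
}

/* Read phase: returns whatever the pointer currently selects. */
uint8_t
toy_read(const struct toy_chip *c)
{
	return c->regs[c->pointer];
}

/*
 * Register read as two separate transactions: write, STOP, read.
 * other_reg >= 0 models a second master that wins the bus after our
 * STOP and selects a register of its own on the same chip.
 */
uint8_t
read_with_stop(struct toy_chip *c, uint8_t reg, int other_reg)
{
	assert(c != NULL);
	toy_write_pointer(c, reg);
	/* STOP: the bus is free here, so the other master may run. */
	if (other_reg >= 0)
		toy_write_pointer(c, (uint8_t)other_reg);
	return toy_read(c);
}

/*
 * Register read with a repeated START: the bus is never released
 * between the write and the read, so the other master has to wait
 * until the whole transaction is finished.
 */
uint8_t
read_with_repeated_start(struct toy_chip *c, uint8_t reg, int other_reg)
{
	uint8_t val;

	assert(c != NULL);
	toy_write_pointer(c, reg);
	/* Repeated START: no STOP, we still own the bus. */
	val = toy_read(c);
	/* STOP: only now can the other master select its register. */
	if (other_reg >= 0)
		toy_write_pointer(c, (uint8_t)other_reg);
	return val;
}
```

With regs[0x10] = 0xaa and regs[0x20] = 0x55, read_with_stop(&chip, 0x10, 0x20) returns 0x55, the other master's register, whereas read_with_repeated_start(&chip, 0x10, 0x20) returns the expected 0xaa.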
[ 1.000000] iic0 at pcfiic0: I2C bus
[ 1.000000] pcagpio0 at iic0 addr 0x18: PCA9556
[ 1.000000] temperature (i2c-lm76) at iic0 addr 0x2b not configured
[ 1.000000] temperature (i2c-lm76) at iic0 addr 0x48 not configured
[ 1.000000] temperature (i2c-lm76) at iic0 addr 0x4f not configured
[ 1.000000] seeprom0 at iic0 addr 0x52: front-io-fru-prom: size 8192
[ 1.000000] seeprom1 at iic0 addr 0x53: sas-backplane-fru-prom: size 8192
[ 1.000000] spdmem0 at iic0 addr 0x57
[ 1.000000] spdmem0: FPM
[ 1.000000] spdmem0: 0 rows, 0 cols, 0 banks, 0ns tRAC, 0ns tCAC
[ 1.000000] hardware-monitor (i2c-adt7462) at iic0 addr 0x58 not configured
I grabbed the LM76 documentation and started modifying the lmtemp driver. With a quick test, I saw strange temperature readings. Going back to my modified i2cscan program, I read the registers and compared them to the documentation. They definitely didn't match. Also, there was no device at address 0x48. After a bit of experimenting and looking at temperature sensor documentation, I discovered that the chip at address 0x2b is a LM95221 and the chip at address 0x4f is an NXP LM75A. The latter is slightly confusing, as it is pin compatible with the original LM75A, but has different temperature sensors and registers. Drivers for these were straightforward and I could extend the existing OFW patching routines to remove the temperature entries and to add properties for these two chips instead.
The chip at address 0x57 was being matched by the spdmem driver because it had a compatible entry of i2c-at34c02. I initially assumed that it should be i2c-at24c64 like the other fru-prom entries, but I couldn't make sense of the values read from the chip. After a bit of digging, I noticed that for the 8K AT24C64 chip, the driver reads 2 bytes at a time. However, if I manually tried to read 1 byte at a time, the values were correct. So, the entry should be i2c-at24c02. The PSU has a smaller EEPROM (256 bytes) than the other components:
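As an illustration of why the register layout matters, here is a small sketch of decoding a temperature register in the format that the NXP LM75A data sheet describes, as far as I read it: an 11-bit two's complement value, left-justified in two bytes, with 0.125 degC per bit. The function name is made up for this example and is not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert the two bytes of an LM75A-style temperature register
 * (MSB first) to millidegrees Celsius.  Assumes an 11-bit two's
 * complement value in the top bits, 0.125 degC per LSB; the low
 * 5 bits of the second byte are ignored.
 */
int32_t
lm75a_to_millidegc(const uint8_t buf[2])
{
	int32_t v;

	assert(buf != NULL);
	/* Drop the 5 unused low bits, leaving a value in 0..2047. */
	v = (((int32_t)buf[0] << 8) | buf[1]) >> 5;
	/* Sign-extend from 11 bits. */
	if (v & 0x400)
		v -= 0x800;
	/* 0.125 degC per bit is 125 millidegrees per bit. */
	return v * 125;
}
```

For example, the bytes 0x24 0xc0 decode to 36750, i.e. 36.750 degC, and 0xe7 0x00 decodes to -25000. A chip with a different resolution or bit layout read through this conversion would produce exactly the kind of strange readings described above.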
dd if=/dev/seeprom2 bs=1 count=256 | hexdump -C
00000000 08 00 01 10 34 01 43 44 00 00 41 b6 00 d7 00 29 |....4.CD..A....)|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000d0 00 00 00 00 00 00 00 c0 2b 38 38 35 2d 30 34 36 |........+885-046|
000000e0 38 2d 30 32 c0 35 43 b6 4d 07 33 30 30 31 38 30 |8-02.5C.M.300180|
000000f0 30 35 32 31 36 39 34 05 ab 30 32 0c ab df a1 e4 |0521694..02.....|
00000100
256+0 records in
256+0 records out
256 bytes transferred in 1.328 secs (192 bytes/sec)
The last device on this bus was the i2c-adt7462. This is quite a complicated chip, with fan control, and temperature and voltage readings. I looked initially at the dbcool driver, because that supported several chips with similar numbers. However, the ADT7462 appeared different enough to need its own driver, though I could use the dbcool and adm1026hm drivers as inspiration. With the sensor names matched to Sun names, the output now looks like:
[ 1.000000] pcfiic0 at ebus0 addr 80-81 ipl 7c1
[ 1.000000] iic0 at pcfiic0: I2C bus
[ 1.000000] pcagpio0 at iic0 addr 0x18: PCA9556
[ 1.000000] seeprom0 at iic0 addr 0x52: front-io-fru-prom: size 8192
[ 1.000000] seeprom1 at iic0 addr 0x53: sas-backplane-fru-prom: size 8192
[ 1.000000] seeprom2 at iic0 addr 0x57: psu-fru-prom: size 256
[ 1.000000] adt7462sm0 at iic0 addr 0x58: ADT7462 system monitor: rev. 0x4
[ 1.000000] adt7462sm0: 5 fans, 4 temperatures, 6 voltages
[ 1.000000] lm95221ts0 at iic0 addr 0x2b: LM95221 temperature sensor
[ 1.000000] nxp75a0 at iic0 addr 0x4f: NXP LM75A temperature sensor
# envstat
Current CritMax WarnMax WarnMin CritMin Unit
[adt7462sm0]
cpu0-fan: 6609 88 RPM
cpu1-fan: 11111 88 RPM
pci-fan: 3333 88 RPM
system-fan3: 3292 88 RPM
system-fan4: 3292 88 RPM
adt7462-sensor: 48.500 80.000 75.000 0.000 degC
cpu0-sensor: 55.750 95.000 70.000 50.000 degC
cpu1-sensor: 77.000 95.000 70.000 50.000 degC
mb-sensor: 46.750 70.000 65.000 0.000 degC
V1.5 1: 1.490 1.989 0.250 V
V1.5 2: 1.498 1.989 V
V3.3 1: 1.238 4.386 V
V3.3 2: 1.232 3.978 V
V12 3: 11.938 15.938 V
V5 1: 5.018 6.630 V
fan fault: 0 0 0 0 0 none
[lm95221ts0]
lm95221-sensor: 36.750 degC
fire-sensor: 56.375 degC
lsi1064-sensor: 64.500 degC
[nxp75a0]
psu-sensor: 36.750 80.000 degC
The next device was the Fire I²C controller. This is based on a Mentor Graphics controller, which is documented in the "Inventra MI2C Product Specification"; very similar text appears in the "Fire Programmer's Reference Manual for Fire 2.1". The I²C clock rate programming and the chip's flow through the I²C sequences were different from other drivers, but the most complicated part was the interrupt support. The chip uses two interrupts on one Pyro leaf, but we have no interrupt support for devices attached directly to mainbus. However, it's possible to poll for status changes (also needed at startup) so the driver has no interrupt support for now. Output on my U45 now looks like:
[ 1.000000] firei2c0 at mainbus0: addr 4000fd20000: Fire/MI2C i2c controller
[ 1.000000] iic1 at firei2c0: I2C bus
[ 1.000000] spdmem0 at iic1 addr 0x50
[ 1.000000] spdmem0: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem0: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem0: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem0: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem1 at iic1 addr 0x51
[ 1.000000] spdmem1: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem1: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem1: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem1: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem2 at iic1 addr 0x52
[ 1.000000] spdmem2: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem2: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem2: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem2: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem3 at iic1 addr 0x53
[ 1.000000] spdmem3: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem3: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem3: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem3: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem4 at iic1 addr 0x54
[ 1.000000] spdmem4: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem4: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem4: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem4: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem5 at iic1 addr 0x55
[ 1.000000] spdmem5: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem5: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem5: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem5: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem6 at iic1 addr 0x56
[ 1.000000] spdmem6: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem6: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem6: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem6: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] spdmem7 at iic1 addr 0x57
[ 1.000000] spdmem7: DDR SDRAM (registered), data ECC, 2GB, 333MHz (PC-2700)
[ 1.000000] spdmem7: 13 rows, 12 cols, 2 ranks, 4 banks/chip, 6.0ns cycle time
[ 1.000000] spdmem7: tAA-tRCD-tRP-tRAS: 1-3-3-7
[ 1.000000] spdmem7: voltage SSTL 2.5V, refresh time 7.8us (self-refreshing)
[ 1.000000] firei2c1 at mainbus0: addr 4000fd30000: Fire/MI2C i2c controller
[ 1.000000] iic2 at firei2c1: I2C bus
[ 1.000000] seeprom3 at iic2 addr 0x51: motherboard-fru-prom: size 8192
[ 1.000000] pcagpio1 at iic2 addr 0x18: PCA9556
[ 1.000000] clock-generator (i2c-ics9fg108) at iic2 addr 0x6e not configured
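Polling for a status change on the Fire I²C controller, instead of waiting for an interrupt, follows the usual pattern of reading a register until a flag appears or a retry budget runs out. The following is a minimal sketch of that pattern only, not the firei2c driver: the flag value 0x08, the reg_read_fn accessor and the fake_ctl controller are assumptions made for the example, and a real driver would go through bus_space(9) and put a delay between reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POLL_IFLG	0x08	/* assumed "status changed" flag bit */

/* Hypothetical register accessor standing in for bus_space(9). */
typedef uint8_t (*reg_read_fn)(void *cookie);

/*
 * Read the control register until the flag is set or max_tries
 * reads have been made.  Returns 0 on success, -1 on timeout.
 */
int
poll_for_flag(reg_read_fn read_ctl, void *cookie, int max_tries)
{
	assert(read_ctl != NULL);
	for (int i = 0; i < max_tries; i++) {
		if (read_ctl(cookie) & POLL_IFLG)
			return 0;
		/* A real driver would wait briefly here. */
	}
	return -1;
}

/* Fake controller for the example: flag appears after N reads. */
struct fake_ctl {
	int reads_until_ready;
};

uint8_t
fake_read_ctl(void *cookie)
{
	struct fake_ctl *f = cookie;

	if (f->reads_until_ready > 0) {
		f->reads_until_ready--;
		return 0x40;			/* flag still clear */
	}
	return 0x40 | POLL_IFLG;		/* flag now set */
}
```

With a fake controller that becomes ready after 3 reads, poll_for_flag(fake_read_ctl, &ctl, 10) returns 0; with one that needs 50 reads, the same call gives up and returns -1, which the caller would treat as a bus timeout.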
There are two PCA9556 GPIOs (one on pcfiic, one on firei2c) so I wondered if they controlled the LEDs like on other Sun machines. Testing the GPIO on firei2c showed that, indeed, the power and the fault LEDs could be controlled. Along with the controls for the fan speeds, the output on my U45 is:
# sysctl hw.adt7462sm0 hw.led
hw.adt7462sm0.pwm1.channel = remote1_dynamic
hw.adt7462sm0.pwm2.channel = remote2_dynamic
hw.adt7462sm0.pwm3.channel = remote3
hw.adt7462sm0.pwm4.duty_cycle = 14
hw.adt7462sm0.pwm4.channel = manual
hw.adt7462sm0.remote1.tmin = 29
hw.adt7462sm0.remote1.trange = 32
hw.adt7462sm0.remote1.oppoint = 60
hw.adt7462sm0.remote2.tmin = 22
hw.adt7462sm0.remote2.trange = 32
hw.adt7462sm0.remote2.oppoint = 60
hw.adt7462sm0.remote3.tmin = 40
hw.adt7462sm0.remote3.trange = 32
hw.led.power = 1
hw.led.fault = 0
From this work, I noticed that the temperature on CPU1 on my U45 is a lot higher, which explains why the fan is running at full speed. At some point, I will need to renew the thermal paste. There are also three additional I²C devices, but I can't work out what they are. I wondered if one was the temperature sensor for the front IO board, which is available when running Solaris, but none give readings that look like temperatures. For the devices at addresses 0x4c and 0x4d, trying a read of register 0xf0 causes a power cycle, so these must be connected to some power monitoring circuits (and maybe don't have multiple registers, so it's writing with the top bit set that causes the power cycle).