Explanations
mentions of animal horns
references to rhino horns
Negative Logits
Yoga      -0.72
Memor     -0.70
Bowling   -0.69
reading   -0.69
NY        -0.69
agall     -0.66
Coco      -0.64
volunt    -0.64
laundry   -0.62
Beg       -0.62
Positive Logits
horns     3.89
horn      3.49
horn      2.65
Horn      2.21
trumpet   1.43
sax       1.23
Rhino     1.14
whistle   1.14
roar      1.02
hump      1.01
Activations Density 0.034%