Explanations
references to the human ear
references to ears in various contexts
Negative Logits
Token          Logit
Mub            -0.73
Mahm           -0.72
HY             -0.69
Vide           -0.67
Alam           -0.67
athan          -0.66
effective      -0.66
Proposition    -0.64
Leap           -0.64
vich           -0.64
Positive Logits
Token          Logit
ears           1.30
ear            1.11
piece          1.06
pieces         0.97
lobe           0.96
butt           0.94
bone           0.93
phones         0.92
wig            0.92
bell           0.91
Activations Density: 0.009%