INDEX
Explanations
the word "jazz."
repeated sequences of the letters "zz"
Negative Logits
conspicuous      -0.77
lapse            -0.73
Polar            -0.72
croft            -0.69
appropriation    -0.68
Conrad           -0.65
compl            -0.64
fitness          -0.63
¥µ               -0.63
pole             -0.63
Positive Logits
arella    1.46
ucc       1.10
zz        1.04
etta      1.01
hou       1.01
arro      0.99
azz       0.94
ella      0.93
eret      0.93
erella    0.93
Activations Density: 0.032%