Explanations
mentions of the word "Rozz" with an emphasis on activations of 9 or 10
instances of the name "Rozz"
Negative Logits
Token           Logit
Conrad          -0.71
¥µ              -0.69
Polar           -0.69
lapse           -0.69
fitness         -0.67
Buffett         -0.65
conditioning    -0.64
croft           -0.63
appropriation   -0.63
foremost        -0.62
Positive Logits
Token           Logit
zz              1.16
arella          1.13
ucc             0.97
azz             0.92
ZZ              0.91
etta            0.91
ella            0.90
abba            0.90
hou             0.88
ebra            0.88
Activations Density: 0.018%