Explanations
instances of the word "guilty"
Negative Logits
gee         -0.17
ãĥ¼ãĥĢ      -0.16
roe         -0.15
alog        -0.15
ollen       -0.14
clerosis    -0.14
ierz        -0.14
alet        -0.14
alah        -0.14
erek        -0.14
Positive Logits
ijkstra     0.17
hers        0.15
Pine        0.15
awai        0.15
yours       0.14
OURS        0.14
å¡ļ         0.14
yc          0.14
cocci       0.14
reetings    0.13
Activations Density: 0.002%