Explanations
mentions of religious sects or divisions
Negative Logits
Burr       -0.14
achi       -0.14
anson      -0.14
ellas      -0.14
zell       -0.14
riott      -0.14
enburg     -0.13
ubu        -0.13
Fior       -0.13
.promise   -0.13
Positive Logits
Wend       0.17
bull       0.17
dojo       0.17
515        0.16
Lamp       0.16
cal        0.15
863        0.14
uel        0.14
erty       0.14
ko         0.13
Activations Density: 0.015%