INDEX
Explanations
references to political entities or organizations
Negative Logits
myſelf        -0.73
ſtate         -0.71
Monfieur      -0.69
itſelf        -0.68
purpoſe       -0.68
Jefus         -0.67
himſelf       -0.66
Eſ            -0.66
ſhould        -0.66
themſelves    -0.65
Positive Logits
fjspx         0.54
Vidite        0.52
">//          0.49
#             0.46
gonic         0.46
Top           0.46
Rhestr        0.46
qrt           0.46
hen           0.45
Dun           0.45
Activations Density: 0.267%