INDEX
NEGATIVE LOGITS
Jefus        -0.77
ſeveral      -0.75
Efq          -0.73
myſelf       -0.73
fometimes    -0.73
Theſe        -0.72
becauſe      -0.72
Monfieur     -0.71
pleaſure     -0.71
whoſe        -0.70
POSITIVE LOGITS
bib              0.62
باً              0.57
bib              0.57
Bib              0.55
المعيارى         0.52
Bib              0.52
клопе            0.52
CreateTagHelper  0.52
principalTable   0.50
semitism         0.50
Activations Density 0.003%