Explanations
repeated references to "we" and "he" in the context of religious or moral assertions
Negative Logits
Moq           -0.74
Adair         -0.66
Egl           -0.65
campagnes     -0.63
polvere       -0.61
cumplido      -0.59
ONESIA        -0.59
themſelves    -0.59
sqlSession    -0.58
foncé         -0.58
Positive Logits
We             1.08
He             0.78
Hochspringen   0.75
WE             0.70
She            0.63
MemoryWarning  0.63
EqualsAnd      0.59
They           0.59
}}"></         0.57
ühner          0.55
Activations Density: 0.163%