Explanations
negative contractions indicating denial or negation
Negative Logits
er        -0.74
Thales    -0.74
tır       -0.73
_('       -0.73
PACE      -0.71
alá       -0.70
Meyer     -0.69
Gibbs     -0.68
Weiss     -0.67
eze       -0.67
Positive Logits
shouldn   1.05
isn       1.04
wasn      1.01
mustn     1.01
couldn    1.00
Shouldn   0.99
hadn      0.98
didn      0.97
aren      0.97
__":      0.95
Activations Density: 0.075%
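
To give a feel for how sparse this density figure is, the sketch below converts it into an expected count of activating tokens. It assumes that "activations density" means the fraction of sampled tokens on which the feature fires at all, which the page itself does not state. The function name and the sample size are hypothetical, chosen only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature is active.

    Assumption: density is the percentage of tokens with a nonzero
    activation, so the expected count is simply density * n_tokens.
    """
    if density_percent < 0 or n_tokens < 0:
        raise ValueError("density_percent and n_tokens must be non-negative")
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Hypothetical sample of one million tokens at the reported 0.075% density.
    print(expected_activations(0.075, 1_000_000))
```

Under that reading, a sample of one million tokens would contain about 750 tokens on which the feature fires, which fits a feature tied to a narrow set of contraction stems such as "shouldn" and "isn".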