Explanations
negations related to personal experiences or abilities
Negative Logits
adpleegd     -0.82
Thales       -0.80
setSource    -0.78
Gibbs        -0.76
Weiss        -0.76
ebe          -0.74
PACE         -0.72
tır          -0.72
merce        -0.71
_('          -0.71
Positive Logits
isn          1.34
wasn         1.26
weren        1.25
Wasn         1.22
aren         1.21
didn         1.19
shouldn      1.17
mustn        1.17
Isn          1.17
doesn        1.15
Activations Density 0.074%