INDEX
Explanations
phrases describing a strong or dominant characteristic
phrases and terms relating to generalizations or common themes
Negative Logits
Token         Logit
utics         -0.82
!.            -0.82
apons         -0.75
=]            -0.74
*.            -0.74
`.            -0.74
ilaterally    -0.73
ustom         -0.72
Ò             -0.72
ensis         -0.71
Positive Logits
Token          Logit
assumption     1.19
implication    1.17
irony          1.16
rationale      1.14
question       1.14
explanation    1.08
impetus        1.06
answer         1.02
goal           1.02
motto          1.00
Activations Density: 0.337%
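
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions on which the feature's activation is above zero; the page itself does not define the term, so that definition, the function name activation_density, the threshold parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature
    fires at all. The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, the feature fires on 3 of them.
toy = [0.0] * 997 + [0.4, 1.2, 2.5]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.337% would correspond to the feature firing on roughly 3 or 4 of every 1000 token positions.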