INDEX
Explanations
phrases related to actions and decisions
phrases that indicate significant statements, actions, or conditions in a socio-political context
Negative Logits
farious    -0.57
senal      -0.54
Saying     -0.52
laugh      -0.50
dding      -0.49
ullivan    -0.49
untled     -0.48
ULL        -0.46
Seym       -0.46
srf        -0.46
Positive Logits
is         1.50
isn        1.39
differs    1.35
belongs    1.35
consists   1.34
exists     1.34
involves   1.33
tends      1.33
depends    1.32
has        1.32
Activations Density: 1.019%