INDEX
Explanations
phrases related to significant events or dramatic actions
phrases related to legal issues and political situations
Negative Logits
their     -0.81
they      -0.76
*.        -0.67
THEY      -0.67
theirs    -0.65
They      -0.64
$.        -0.63
THEIR     -0.63
)).       -0.63
They      -0.62
Positive Logits
himself    1.13
his        0.85
athlet     0.83
herself    0.82
onstage    0.76
backstage  0.75
iership    0.75
His        0.75
coaching   0.74
solo       0.73
Activations Density 0.684%