Explanations
words related to legal or criminal actions and consequences
words related to violations, authority, and consequences
Negative Logits
Token            Logit
apest            -0.62
realise          -0.60
avail            -0.56
endeavour        -0.56
decentral        -0.56
yours            -0.54
yeah             -0.53
Ô                -0.53
defin            -0.52
)=               -0.52
Positive Logits
Token            Logit
during           1.19
amid             1.05
improperly       0.98
during           0.97
prematurely      0.94
inappropriately  0.92
onstage          0.89
while            0.87
midway           0.80
after            0.78
Activations Density: 0.544%