INDEX
Explanations
phrases related to actions or behaviors of individuals or groups
presence of pronouns and auxiliary verbs indicating action or necessity
Negative Logits
NSA         -0.60
oiler       -0.59
disclaimer  -0.58
ENSE        -0.57
ertodd      -0.56
igsaw       -0.56
geries      -0.55
ente        -0.55
Dmit        -0.55
Vaj         -0.54
Positive Logits
destined    0.80
formerly    0.72
ago         0.69
frequ       0.69
already     0.68
previously  0.68
SI          0.66
"$:/        0.66
otherwise   0.66
ufact       0.66
Activations Density: 0.256%