INDEX
Explanations
phrases related to instructions or guidelines
statements related to rules and conditions
Negative Logits
wake          -0.79
roit          -0.74
Americans     -0.73
ahu           -0.71
alky          -0.70
aud           -0.66
Initialized   -0.65
ocrats        -0.64
atlantic      -0.62
lie           -0.61
Positive Logits
also          1.04
optional      0.97
plenty        0.94
currently     0.89
ALSO          0.85
lots          0.81
however       0.79
optionally    0.79
adjustable    0.79
no            0.78
Activations Density 0.155%
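
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of sampled token positions on which the feature's activation is above zero. This is an assumption about the definition, not something the page states. The function name activation_density, the threshold parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: the page does not say how its density is computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2000 token positions, 3 of which activate the feature.
sample = [0.0] * 1997 + [0.4, 1.2, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.155% would mean the feature fires on roughly 1 or 2 of every 1,000 sampled tokens.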