INDEX
Explanations
commands or instructions starting with "If you want to"
phrases expressing desires or intentions
Negative Logits
schild       -0.78
ilings       -0.73
osi          -0.71
ilian        -0.67
sbm          -0.66
usted        -0.66
agar         -0.66
announced    -0.66
ikh          -0.65
rup          -0.64
Positive Logits
something        0.87
to               0.86
proof            0.78
clarification    0.76
anything         0.74
someone          0.74
yourself         0.73
fuller           0.73
anonymity        0.73
more             0.71
Activations Density: 0.068%