INDEX
Explanations
advice or recommendations in the form of commands or suggestions
phrases indicating advice or recommendations
Negative Logits
NetMessage  -0.93
eur         -0.75
âĹ¼         -0.71
oid         -0.69
cible       -0.60
SON         -0.59
oids        -0.59
Niet        -0.58
Pear        -0.57
oidal       -0.57
Positive Logits
beware      0.75
hurry       0.75
scram       0.66
organise    0.66
insure      0.66
raise       0.66
rity        0.65
lett        0.65
rack        0.65
luck        0.64
Activations Density: 0.050%