INDEX
Explanations
commands or instructions related to taking action
Negative Logits
bestos      -0.77
zona        -0.73
anmar       -0.68
ushima      -0.67
ivia        -0.66
plane       -0.64
plateau     -0.64
gins        -0.63
subt        -0.63
ento        -0.62
Positive Logits
Fight       3.10
Brave       1.61
Messenger   1.44
messenger   1.42
Nim         1.34
Commander   1.24
imble       1.19
Brave       1.13
provoked    1.00
Clash       0.96
Activations Density: 0.051%