INDEX
Explanations
phrases related to taking action or making choices
Negative Logits
transparence    -0.53
únicos          -0.51
wiele           -0.49
élève           -0.48
villaggio       -0.48
únicas          -0.47
vaisselle       -0.46
supérieures     -0.46
iega            -0.46
щет             -0.45
Positive Logits
grab            1.25
grabbing        1.19
Grab            1.19
grabbed         1.18
grab            1.17
Grab            1.14
grabs           1.09
slap            0.91
shove           0.90
slapped         0.89
Activations Density 0.269%
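
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is obtained: the fraction of token positions on which a feature's activation is above zero. This is an assumption about how the statistic is defined, not something the page states; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations for one feature over 8 token positions.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.269% would mean the feature fires on roughly 27 of every 10,000 token positions sampled.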