INDEX
Explanations
actions related to the transfer or giving of items or information
Negative Logits
iras      -0.16
tes       -0.15
gross     -0.15
694       -0.15
ders      -0.14
unding    -0.14
Cabinet   -0.14
482       -0.14
neutral   -0.14
ector     -0.14
Positive Logits
handing   0.25
handed    0.23
.Hand     0.18
iž        0.17
hiba      0.17
.hand     0.17
-hand     0.16
reins     0.15
keys      0.15
Hand      0.15
Activations Density: 0.021%