Explanations
actions associated with pushing or making progress
Negative Logits
Token        Logit
whiteColor   -0.70
mycompany    -0.70
listdir      -0.63
portátil     -0.63
Ston         -0.61
cein         -0.61
amentul      -0.59
oreilles     -0.59
toner        -0.58
Coates       -0.58
Positive Logits
Token        Logit
push         1.89
PUSH         1.69
pushes       1.68
Pushing      1.67
Push         1.66
pushed       1.60
pushing      1.60
pusher       1.58
push         1.57
Pushing      1.54
Activations Density 0.074%