INDEX
Explanations
actions of physical interactions or contact
Negative Logits
ſelves      -0.62
ſelf        -0.61
pleaſure    -0.60
Anſ         -0.60
ſever       -0.58
Inſ         -0.56
Conſ        -0.56
Eſ          -0.55
noDo        -0.55
onFailure   -0.54
Positive Logits
hands       0.87
hand        0.77
tangan      0.76
Hands       0.74
Hände       0.71
Hands       0.70
rę          0.70
HANDS       0.68
fingers     0.67
Händen      0.66
Activations Density 0.201%