Explanations
taking away, robbing, stealing
Negative Logits
V    0.66
其他    0.62
रिक    0.61
bindung    0.57
вя    0.56
خ    0.55
ķ    0.55
Fl    0.53
त्य    0.53
kow    0.53
Positive Logits
কেড়ে    0.72
छीन    0.65
夺    0.61
invade    0.59
prodotto    0.57
aree    0.57
graze    0.57
robbing    0.56
奪    0.55
theft    0.54
Activations Density: 0.045%
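The density figure is usually read as the fraction of sampled token positions on which the feature fires. The page itself does not define it, so that reading is an assumption. A minimal sketch under that assumption follows; the function name `activation_density`, the zero threshold, and the sample values are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens with a
    nonzero activation. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 9 of 20,000 token positions.
sample = [0.0] * 19_991 + [1.3] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density this low means the feature is silent on nearly all tokens, which fits a narrow concept such as the one given in the explanation.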