INDEX
Explanations
expressions of personal ownership or possession
Negative Logits
pa      -0.15
oy      -0.15
275     -0.15
ood     -0.14
pig     -0.14
al      -0.14
Diet    -0.14
ov      -0.14
зал     -0.14
s       -0.13
Positive Logits
vant    0.16
磨      0.15
bote    0.15
kud     0.15
iš      0.15
herits  0.15
rrha    0.14
tesy    0.14
ÅĻeh    0.14
ibi     0.14
Activations Density: 0.045%