INDEX
Explanations
phrases that indicate possession or ownership, particularly with the word "our."
Negative Logits
ories    -0.17
asty     -0.16
eno      -0.16
aters    -0.16
ons      -0.15
ải       -0.15
holds    -0.15
rics     -0.15
OE       -0.14
orous    -0.14
Positive Logits
anos     0.21
anou     0.17
tesy     0.17
uguay    0.16
SEL      0.16
imei     0.16
iginal   0.15
own      0.15
/my      0.15
mutual   0.15
Activations Density 0.170%