INDEX
Explanations
occurrences of the word "my" in various contexts related to personal experiences and possessions
Negative Logits
Token     Logit
s         -0.18
ories     -0.17
unanim    -0.15
elder     -0.14
ź         -0.14
ensed     -0.13
atos      -0.13
mi        -0.13
nya       -0.13
umbs      -0.13
Positive Logits
Token     Logit
riad      0.36
opic      0.33
rtle      0.32
anmar     0.31
opia      0.30
ri        0.29
myself    0.28
rrha      0.27
embros    0.26
/my       0.23
Activations Density: 0.146%