INDEX
Explanations
possession and relational expressions in context
Negative Logits
Token        Logit
nero         -0.16
teenth       -0.15
Cong         -0.14
Cors         -0.14
rika         -0.14
ropol        -0.14
sig          -0.14
ãĥ³ãĥĨãĤ£    -0.13
thro         -0.13
ç¬           -0.13
Positive Logits
Token        Logit
ch           0.43
cha          0.36
chw          0.36
chl          0.35
chu          0.35
pie          0.34
chie         0.34
pez          0.34
icher        0.32
chat         0.31
Activations Density: 0.017%
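
A minimal sketch of how a density figure like this could be computed, assuming it means the fraction of token positions on which the feature has a nonzero activation. The page does not state its exact definition, so the function name `activation_density`, the `threshold` parameter, and the sample data below are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 17 of which activate the feature.
acts = [0.0] * 100_000
for i in range(17):
    acts[i * 1000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.017%
```

Under this reading, a density of 0.017% corresponds to the feature firing on roughly 17 of every 100,000 tokens.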