INDEX
Explanations
topics related to social dynamics and interpersonal relationships
Negative Logits
ãĥ¼ãĥ³    -0.16
ufen      -0.16
æĭ¬       -0.15
lical     -0.15
WEEN      -0.15
ADVISED   -0.14
ogle      -0.14
anko      -0.14
é¨        -0.14
lico      -0.13
Positive Logits
whom      0.38
nearby    0.34
around    0.28
who       0.26
near      0.25
close     0.24
near      0.23
similarly 0.22
around    0.21
nearest   0.20
Activations Density: 0.183%