INDEX
Explanations
connections and interactions among characters or individuals in a narrative
Negative Logits
Token      Logit
onda       -0.15
芯         -0.15
awi        -0.15
nào        -0.14
izon       -0.14
Miz        -0.14
Writes     -0.14
himself    -0.13
Solomon    -0.13
ernetes    -0.13
Positive Logits
Token      Logit
pector     0.16
gether     0.16
iado       0.14
kork       0.14
darwin     0.14
antan      0.14
amel       0.14
birlikte   0.14
ninh       0.13
两人       0.13
Activations Density 0.400%