INDEX
Explanations
references to interactions and relationships between people
Negative Logits
åĩ              -0.17
.scalablytyped  -0.15
onald           -0.15
ogi             -0.15
eti             -0.15
pees            -0.15
kir             -0.14
kazy            -0.14
conom           -0.14
lund            -0.13
Positive Logits
æ»ħ             0.16
others          0.15
lined           0.14
Fabric          0.14
ering           0.14
sana            0.14
ulence          0.14
acre            0.13
äl              0.13
other           0.13
Activations Density: 0.459%