INDEX
Explanations
discussions related to personality types and social dynamics
Negative Logits
Token       Logit
inkle       -0.13
Surround    -0.13
عن          -0.13
جل          -0.13
çģ          -0.13
Dank        -0.13
Dwarf       -0.13
unrelated   -0.12
zusammen    -0.12
arget       -0.12
Positive Logits
Token       Logit
oscill      0.39
amb         0.37
split       0.35
dich        0.32
pend        0.31
torn        0.30
splits      0.30
fluct       0.30
neither     0.30
Osc         0.29
Activations Density: 0.294%