INDEX
Explanations
references to shared values and collaboration among groups or individuals
Negative Logits
lier     -0.22
Crush    -0.19
uars     -0.17
adin     -0.15
merch    -0.15
Grat     -0.15
öt       -0.14
ensch    -0.14
Kurd     -0.14
ombo     -0.14
Positive Logits
/Common  0.17
åĬ¡      0.16
/shared  0.16
941      0.15
åĭĻ      0.15
ertz     0.14
common   0.14
985      0.14
psc      0.14
esso     0.14
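
Two of the positive-logit tokens (åĬ¡ and åĭĻ) look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The sketch below is a rough illustration, assuming the underlying tokenizer follows the GPT-2 style byte-to-character mapping; the page itself does not name the tokenizer, and `decode_token` is a hypothetical helper name. It shows how such a token string could be mapped back to readable text.

```python
import unicodedata


def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table.

    Assumption: the model behind this dashboard uses this mapping.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # '¡' .. '¬'
        + list(range(0xAE, 0x100))  # '®' .. 'ÿ'
    )
    printable_set = set(printable)
    mapping = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to code points starting at U+0100.
    n = 0
    for b in range(256):
        if b not in printable_set:
            mapping[b] = chr(256 + n)
            n += 1
    return mapping


def decode_token(token):
    """Turn a byte-level vocabulary string back into UTF-8 text."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    # Normalize so accented letters are single code points before lookup.
    token = unicodedata.normalize("NFC", token)
    raw = bytes(char_to_byte[ch] for ch in token)
    # A token may hold only part of a character; replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")
```

Under that assumption, `decode_token("åĬ¡")` gives 务 and `decode_token("åĭĻ")` gives 務, the simplified and traditional forms of the same Chinese character. Plain ASCII tokens such as `common` come back unchanged.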
Activations Density 0.113%