Explanations
references to membership in organizations or programs
Negative Logits
Token            Logit
akis             -0.17
bart             -0.17
々               -0.17
udu              -0.16
Alv              -0.16
erah             -0.15
inho             -0.15
RL               -0.15
field            -0.15
ェ               -0.15
Positive Logits
Token            Logit
者的             0.18
者               0.18
opportunities    0.17
holders          0.17
-threatening     0.16
-minded          0.16
status           0.16
ainted           0.16
者の             0.15
wner             0.15
Activations Density 0.080%