INDEX
Explanations
statements related to organizational responses and individual perspectives on issues
Negative Logits
-in      -0.20
-IN      -0.17
inas     -0.16
hurst    -0.15
instr    -0.15
たら     -0.15
/in      -0.14
zy       -0.14
IN       -0.14
inka     -0.13
Positive Logits
in       0.45
在       0.34
在       0.33
în       0.31
في       0.31
，在     0.30
trong    0.28
ใน       0.27
dalam    0.26
้ใน      0.24
Activations Density 0.373%
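
Dashboards of this kind often display non-ASCII tokens in the raw byte-level form used by GPT-2-style BPE vocabularies, for example `åľ¨` for 在. The sketch below shows one way to map such a display string back to readable text. It assumes the tokenizer uses the GPT-2 byte-to-unicode table, which this page does not confirm. The names `bytes_to_unicode` and `decode_token` are illustrative helpers, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Build the GPT-2-style byte -> printable-character table (assumed mapping)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Decode a byte-level display string; return it unchanged if it is not one."""
    try:
        raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
        return raw.decode("utf-8")
    except (KeyError, UnicodeDecodeError):
        # Already-readable text (or an incomplete byte sequence) is left as-is.
        return display


if __name__ == "__main__":
    for token in ["åľ¨", "ÙģÙĬ", "à¹ĥà¸Ļ", "in"]:
        print(repr(token), "->", repr(decode_token(token)))
```

Plain ASCII tokens such as `in` pass through unchanged, and strings that do not form valid UTF-8 after the reverse mapping are returned as they were, so the helper can be applied to a whole token column without special-casing.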