Explanations
references to attitudes and cultural values
Negative Logits
erce              -0.17
SupportedContent  -0.17
zk                -0.15
fall              -0.15
lette             -0.15
holm              -0.15
falls             -0.15
lsen              -0.15
ERY               -0.14
Sta               -0.14
Positive Logits
toward            0.20
towards           0.20
attitudes         0.19
attitude          0.18
cnt               0.15
disposition       0.15
んど              0.15
向                0.15
218               0.15
Tow               0.15
Activations Density 0.062%