INDEX
Explanations
references to spatial relationships involving the subject 'him' or 'them'
references to individuals and groups in relation to positions or actions
Negative Logits
Token          Logit
ラン           -0.80
Geek           -0.70
-)             -0.69
fact           -0.69
ニ             -0.66
inctions       -0.65
ammy           -0.65
Psy            -0.64
ociate         -0.62
itialized      -0.61
Positive Logits
Token            Logit
atic             0.83
conduc           0.82
atics            0.81
selves           0.77
alian            0.76
tremend          0.69
dracon           0.69
unbeat           0.68
contemporaries   0.67
andering         0.66
Activations Density 0.177%
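
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions on which the
    feature fires. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, the feature fires on 2 of them.
sample = [0.0] * 998 + [1.3, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.177% would mean the feature is active on roughly 1.77 of every 1,000 sampled tokens.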