INDEX
Explanations
references to social roles and structures within communities or organizations
Negative Logits
Direction   -0.16
upside      -0.15
fewer       -0.15
doby        -0.15
прид        -0.15
Away        -0.14
Upper       -0.14
Upper       -0.14
reater      -0.14
inverted    -0.14
Positive Logits
right       0.54
up          0.45
right       0.40
RIGHT       0.39
down        0.36
all         0.35
Right       0.35
.right      0.34
-right      0.33
Right       0.33
Activations Density 0.212%