Explanations
entities and roles related to people in authoritative positions
Negative Logits
SharedDtor    -0.90
anyahu        -0.67
pushFollow    -0.66
propOrder     -0.66
CURIAM        -0.64
szóci         -0.61
Luego         -0.56
Infórmanos    -0.56
lenker        -0.55
irra          -0.55
Positive Logits
branded       0.81
branding      0.74
apologised    0.70
labelled      0.62
criticised    0.58
apologise     0.58
backed        0.57
labelling     0.56
accepts       0.54
raged         0.54
Activations Density 0.153%