INDEX
Explanations
phrases related to different individuals having experience working under specific people or organizations
phrases indicating a context of performance evaluation or supervision
Negative Logits
iterranean          -0.76
ument               -0.68
bernatorial         -0.65
ILLE                -0.62
Edison              -0.61
`                   -0.60
auga                -0.60
0000000000000000    -0.60
Warsaw              -0.57
Minecraft           -0.57
Positive Logits
pins                1.08
dogs                1.06
neath               1.04
lining              1.02
whelming            1.01
cutting             0.96
graduate            0.95
whel                0.94
lined               0.94
cuts                0.94
Activations Density: 0.047%