INDEX
Explanations
phrases related to job titles and positions
Negative Logits
surgeons    -0.67
rums        -0.64
stret       -0.64
Attach      -0.63
idiots      -0.61
needles     -0.61
therapists  -0.60
unbeliev    -0.59
Duration    -0.59
fools       -0.58
Positive Logits
emer        1.13
extraord    0.81
overseeing  0.79
utive       0.77
Emer        0.75
itatively   0.75
director    0.74
hesis       0.71
of          0.70
wark        0.67
Activations Density: 0.592%