Explanations
references to clerical work or positions
Negative Logits
tte        -0.15
osta       -0.15
ille       -0.15
ersh       -0.15
e          -0.15
HD         -0.15
anmar      -0.14
chilled    -0.14
yses       -0.14
HR         -0.14
Positive Logits
ical       0.26
gy         0.21
ks         0.20
ihan       0.18
mont       0.18
ics        0.18
kin        0.18
ken        0.17
king       0.17
ovní       0.16
Activations Density 0.005%