INDEX
Explanations
names and titles associated with political and cultural figures
Negative Logits
920       -0.15
ispecies  -0.14
angu      -0.14
Uhr       -0.14
ight      -0.13
jedn      -0.13
cif       -0.13
dív       -0.13
awn       -0.13
awa       -0.13
Positive Logits
retirement  0.20
retired     0.17
widow       0.17
retire      0.16
Retirement  0.16
asti        0.16
retirees    0.15
STALL       0.15
bubble      0.15
anton       0.14
Activations Density 0.238%