INDEX
Explanations
names followed by the title "retired" in text
mentions of individuals who are retired
Negative Logits
iola            -0.71
Grimm           -0.67
ebus            -0.65
Hispan          -0.63
coord           -0.63
Rig             -0.62
owitz           -0.62
hran            -0.61
similarities    -0.61
arta            -0.61
Positive Logits
retiring        0.88
retire          0.77
uting           0.75
spective        0.74
rifice          0.74
dinand          0.72
unte            0.72
ret             0.72
utes            0.70
Veter           0.70
Activations Density 0.017%