Explanations
years mentioned within parentheses
the presence of parentheses in text, likely indicating titles or references
Negative Logits
prey              -0.82
retard            -0.79
faces             -0.77
clin              -0.76
crop              -0.74
administration    -0.74
physique          -0.74
crocod            -0.73
discrim           -0.73
swell             -0.73
Positive Logits
formerly          1.83
which             1.67
aka               1.57
pictured          1.53
2006              1.48
2003              1.48
2004              1.47
also              1.47
1993              1.47
1998              1.46
Activations Density: 0.105%