INDEX
Explanations
names of specific individuals, such as actors or authors
references to specific individuals, particularly Baldwin and Kaufman
Negative Logits
arthed      -0.80
ifier       -0.80
sis         -0.75
ional       -0.75
ually       -0.72
ifiable     -0.72
arist       -0.71
uate        -0.68
arant       -0.67
EMBER       -0.66
Positive Logits
enegger     0.80
Sapp        0.80
AFB         0.79
fman        0.76
Baldwin     0.76
Downs       0.76
Nelson      0.69
Kaufman     0.69
Henderson   0.67
Fitzgerald  0.67
Activations Density: 0.048%