Explanations
proper nouns, particularly the name "Marshall" at different intensities
references to the name "Marshall."
Negative Logits
Magikarp    -0.98
eds         -0.76
reen        -0.75
sis         -0.73
hire        -0.71
yr          -0.70
lish        -0.69
profits     -0.67
visual      -0.67
ours        -0.65
Positive Logits
mallow      1.04
Marshall    0.90
Islands     0.83
Rogers      0.82
McL         0.82
ufact       0.82
Beckham     0.79
Faul        0.78
hyde        0.78
abilia      0.77
Activations Density 0.023%