INDEX
Explanations
mentions of prestigious awards, specifically the Nobel Prize
references to prestigious awards, particularly the Nobel Prize and the Pulitzer Prize
NEGATIVE LOGITS
den        -0.82
rir        -0.72
icago      -0.66
alter      -0.65
strong     -0.65
nes        -0.63
aturdays   -0.63
icas       -0.61
ocal       -0.61
pher       -0.61
POSITIVE LOGITS
Prize      1.42
laureate   1.35
prize      1.07
Winners    1.06
Winner     1.05
prizes     0.99
awarded    0.95
award      0.92
Winner     0.92
winner     0.90
Activations Density 0.007%