Explanations
references to Pulitzer Prizes and their winners
Head Attr Weights
0:0.39
1:0.01
2:0.06
3:0.11
4:0.03
5:0.13
6:0.02
7:0.04
8:0.06
9:0.02
10:0.03
11:0.02
Negative Logits
Lego         -2.62
Iw           -2.59
Ridley       -2.58
Sega         -2.43
Whedon       -2.32
Tant         -2.29
Constantin   -2.27
Hera         -2.23
Atari        -2.23
Konami       -2.19
Positive Logits
ribune       2.80
Pulitzer     2.76
Reporting    2.73
dataset      2.37
opol         2.36
veyard       2.35
journal      2.34
icter        2.33
owler        2.32
reader       2.29
Activations Density 0.001%