INDEX
Explanations
punctuation marks, specifically commas
Head Attr Weights
0:0.09
1:0.05
2:0.03
3:0.26
4:0.11
5:0.08
6:0.06
7:0.02
8:0.11
9:0.08
10:0.02
11:0.02
Negative Logits
newsletter     -2.44
Newsweek       -2.39
publication    -2.30
memos          -2.22
blog           -2.19
forthcoming    -2.10
published      -2.10
NYT            -2.08
published      -2.08
Newsletter     -2.08
Positive Logits
oleon          2.35
stunts         2.33
ihad           2.23
Comet          2.14
ateurs         2.10
igers          2.07
azy            2.04
digy           2.03
zees           1.97
itimate        1.97
Activations Density 0.000%