Explanations
punctuation, specifically commas
Head Attr Weights
0:0.09
1:0.08
2:0.08
3:0.07
4:0.08
5:0.08
6:0.07
7:0.07
8:0.08
9:0.08
10:0.09
11:0.08
Negative Logits
Bowman     -2.64
balloons   -2.57
Idol       -2.55
Israelis   -2.52
Roma       -2.48
landers    -2.43
leneck     -2.40
blond      -2.35
Lund       -2.34
Bowie      -2.34
Positive Logits
icio       3.04
Ty         2.80
cc         2.60
htt        2.58
Anth       2.57
Torrent    2.57
hepat      2.54
yip        2.53
raph       2.53
rop        2.51
Activations Density 0.000%