INDEX
Explanations
punctuation marks, specifically colons
Head Attr Weights
0:0.07
1:0.06
2:0.07
3:0.10
4:0.07
5:0.09
6:0.07
7:0.08
8:0.08
9:0.09
10:0.08
11:0.09
Negative Logits
brow       -1.74
igmatic    -1.68
abeth      -1.67
loo        -1.65
alone      -1.63
licens     -1.63
edin       -1.60
fax        -1.59
aler       -1.57
olin       -1.54
Positive Logits
overturned    1.75
raping        1.61
Cheong        1.59
itism         1.54
blaming       1.54
「             1.54
Timeout       1.52
unaccount     1.50
pressuring    1.50
overturn      1.46
Activations Density: 0.000%