INDEX
Explanations
punctuation marks and special characters in contexts that may indicate structure in the text
Head Attr Weights
Head  Weight
0     0.01
1     0.01
2     0.03
3     0.10
4     0.04
5     0.03
6     0.06
7     0.41
8     0.03
9     0.04
10    0.05
11    0.13
Negative Logits
Token     Logit
arin      -1.62
FSA       -1.53
anol      -1.52
asus      -1.50
seminars  -1.44
Pent      -1.44
Camel     -1.40
II        -1.39
iege      -1.36
sen       -1.36
Positive Logits
Token      Logit
goal       1.61
stre       1.57
acio       1.56
laim       1.50
Publisher  1.48
ritic      1.42
grave      1.40
oire       1.37
grate      1.35
rever      1.35
Activations Density: 0.004%