INDEX
Explanations
occurrences of punctuation marks, particularly commas
Head Attr Weights
0:0.06
1:0.02
2:0.07
3:0.06
4:0.06
5:0.04
6:0.25
7:0.03
8:0.08
9:0.04
10:0.17
11:0.06
Negative Logits
BIL       -1.74
urious    -1.65
ocious    -1.56
urgical   -1.54
selage    -1.54
pload     -1.53
ufact     -1.51
oulos     -1.47
ャ        -1.46
oaded     -1.41
Positive Logits
USSR        1.84
Taiwan      1.61
Arabia      1.50
Thur        1.49
Country     1.48
Congo       1.45
USA         1.44
Country     1.43
borders     1.40
respective  1.38
Activations Density 0.001%