INDEX
Explanations
punctuation and special characters in the text
Head Attr Weights
0:0.03
1:0.33
2:0.04
3:0.09
4:0.04
5:0.06
6:0.03
7:0.09
8:0.12
9:0.02
10:0.04
11:0.06
Negative Logits
millenn       -2.95
orescent      -2.94
orously       -2.68
orescence     -2.63
continual     -2.62
ourselves     -2.62
truly         -2.52
exhibitions   -2.50
ram           -2.49
士            -2.46
Positive Logits
Arkansas      3.46
ワ            3.15
California    3.10
Idaho         3.08
Mesa          2.96
Colorado      2.89
Ark           2.88
aska          2.85
Arizona       2.81
47            2.75
Activations Density 0.001%