INDEX
Explanations
proper nouns and significant names
Head Attr Weights
0:0.05
1:0.38
2:0.03
3:0.04
4:0.03
5:0.18
6:0.05
7:0.02
8:0.03
9:0.05
10:0.05
11:0.03
Negative Logits
Phill        -1.90
Cold         -1.83
psi          -1.81
Planet       -1.77
pengu        -1.69
thumbnails   -1.68
Phill        -1.67
Wal          -1.66
press        -1.65
549          -1.63
Positive Logits
ad           2.73
ads          2.36
adan         2.22
adh          2.21
atar         2.19
ado          2.17
adish        2.10
aden         2.02
ader         2.02
Fas          1.89
Activations Density: 0.004%