INDEX
Explanations
proper nouns, particularly those related to locations and people, as well as mentions of specific events or dates
Head Attr Weights
0:0.06
1:0.49
2:0.05
3:0.03
4:0.05
5:0.02
6:0.04
7:0.06
8:0.02
9:0.02
10:0.07
11:0.04
Negative Logits
ロ             -2.67
rax           -2.49
Interstitial  -2.43
END           -2.41
GEN           -2.40
ajo           -2.40
iverse        -2.38
aired         -2.38
aido          -2.36
Winged        -2.36
Positive Logits
Hard          8.99
Hard          8.58
hard          7.97
hard          7.63
Soft          6.38
soft          6.24
harder        6.22
Soft          5.86
hardness      5.68
soft          5.52
Activations Density 0.132%
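
The density figure above is not defined on this page. A common reading is the fraction of token positions on which the feature is active, and the sketch below assumes that definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold
    (0.0 by default). The dashboard may use a different cutoff.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 13 of them active.
sample = [0.0] * 9987 + [1.5] * 13
density = activation_density(sample)

# Format as a percentage with three decimals, like the figure above.
print(f"{density:.3%}")
```

Under this reading, a density of 0.132% would mean the feature is active on roughly 13 out of every 10,000 tokens.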