INDEX
Explanations
references to names or terms related to notable individuals or entities
Head Attr Weights
0:0.02
1:0.02
2:0.04
3:0.06
4:0.04
5:0.05
6:0.40
7:0.07
8:0.05
9:0.07
10:0.07
11:0.04
Negative Logits
Downloadha    -1.57
legitimately  -1.39
assets        -1.29
Finnish       -1.23
esthetic      -1.22
duct          -1.21
veins         -1.20
asking        -1.19
delegation    -1.17
shaping       -1.17
Positive Logits
��     1.60
jong   1.52
י      1.35
rand   1.34
Royal  1.33
Prin   1.33
urga   1.32
�士    1.30
lang   1.30
bad    1.30
Activations Density 0.003%