INDEX
Explanations
references to statistical data and reports from official sources
Head Attr Weights
0:0.02
1:0.03
2:0.19
3:0.29
4:0.10
5:0.04
6:0.03
7:0.03
8:0.04
9:0.06
10:0.08
11:0.04
Negative Logits
)."       -1.96
),"       -1.50
!).       -1.46
NPR       -1.45
!'"       -1.43
]."       -1.41
.""       -1.40
).[       -1.39
!)        -1.39
Blitz     -1.38
Positive Logits
isEnabled  1.50
BALL       1.46
eni        1.43
ui         1.40
nutshell   1.39
eno        1.39
cli        1.36
aden       1.35
comfort    1.35
gency      1.33
Activations Density: 0.004%