Explanations
numerical references or rankings
Head Attr Weights
0:0.14
1:0.04
2:0.02
3:0.11
4:0.08
5:0.08
6:0.12
7:0.03
8:0.20
9:0.08
10:0.02
11:0.03
Negative Logits
motion       -1.91
lihood       -1.83
pse          -1.80
¶            -1.69
tsky         -1.67
perjury      -1.67
etc          -1.65
Lect         -1.62
fallacy      -1.62
Rothschild   -1.61
Positive Logits
��           1.96
魔           1.93
ドラ         1.89
(not rendered) 1.83
WithNo       1.82
Mini         1.82
Reviewed     1.82
�            1.81
scrib        1.80
�            1.80
Activations Density 0.000%