INDEX
Explanations
technical or complex expressions related to data and computation
Head Attr Weights
0:0.03
1:0.05
2:0.06
3:0.02
4:0.03
5:0.03
6:0.03
7:0.08
8:0.10
9:0.05
10:0.39
11:0.06
Negative Logits
Hut:-2.87
Lep:-2.78
UFF:-2.64
Del:-2.59
yip:-2.58
atever:-2.58
ERAL:-2.58
evict:-2.56
Mald:-2.53
Pe:-2.53
Positive Logits
s:9.10
ss:5.96
s:5.17
S:4.61
ss:4.45
sb:4.26
S:4.18
sg:3.69
sd:3.61
sat:3.45
Activations Density 0.143%