Explanations
expressions of potentiality and effects, especially in relation to actions and outcomes
Head Attr Weights
0:0.02
1:0.01
2:0.08
3:0.13
4:0.08
5:0.04
6:0.31
7:0.06
8:0.03
9:0.04
10:0.07
11:0.08
Negative Logits
Ys        -1.38
ITNESS    -1.34
atform    -1.33
terday    -1.26
wolves    -1.22
gaard     -1.20
ashi      -1.19
Zak       -1.18
Ruk       -1.17
gemony    -1.16
Positive Logits
odder       1.59
depending   1.57
depending   1.52
easily      1.41
siph        1.34
taboola     1.33
quite       1.33
gettable    1.32
xtap        1.29
Depending   1.26
Activations Density 0.111%