Explanations
verbs and related terms suggesting action or functionality
Head Attr Weights
0:0.03
1:0.03
2:0.33
3:0.03
4:0.10
5:0.03
6:0.03
7:0.08
8:0.03
9:0.05
10:0.14
11:0.08
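A minimal sketch for reading the listing above: it loads the twelve head weights into a dictionary and ranks them. The helper name `rank_heads` is hypothetical, and nothing here assumes how the weights were originally computed.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.03, 2: 0.33, 3: 0.03, 4: 0.10, 5: 0.03,
    6: 0.03, 7: 0.08, 8: 0.03, 9: 0.05, 10: 0.14, 11: 0.08,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    # Hypothetical helper, named for illustration only.
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]
total = sum(head_weights.values())

print(f"top head: {top_head} ({top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

On these values, head 2 carries the largest weight (0.33), followed by head 10 (0.14); the listed weights sum to 0.96.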
Negative Logits
arat        -1.29
imar        -1.21
Breat       -1.18
Reincarn    -1.16
Wyn         -1.15
misunder    -1.09
ukong       -1.08
��          -1.08
bek         -1.07
overc       -1.04
Positive Logits
cli         1.21
iest        1.18
aign        1.16
olicy       1.15
hire        1.12
azo         1.12
SHIP        1.11
uses        1.11
udence      1.10
¶           1.07
Activations Density 0.176%
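The page does not say how the density figure is computed. A common reading is the fraction of tokens on which the feature has a nonzero activation, and the sketch below assumes that definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition; the source does not specify one.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 176 active tokens out of 100,000 in total.
sample = [1.0] * 176 + [0.0] * (100_000 - 176)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.176% would mean the feature fires on roughly 176 of every 100,000 tokens.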