Explanations
words indicating purpose or aim
expressions related to intentions
Negative Logits
cit          -0.84
java         -0.67
Solitaire    -0.64
Tycoon       -0.64
visors       -0.63
Figures      -0.63
Medals       -0.59
Kerr         -0.59
Wolves       -0.59
hill         -0.58
Positive Logits
ality        1.07
ful          0.96
lessly       0.87
reprene      0.85
phis         0.84
fulness      0.82
edly         0.81
ually        0.81
ual          0.74
ãĥĨ          0.74
Activations Density: 0.022%