INDEX
Explanations
expressions of desire or intention regarding actions and goals
Negative Logits
Token        Logit
egli         -0.43
unsuccessful -0.37
Groves       -0.37
anh          -0.36
technique    -0.35
fiche        -0.35
Pyle         -0.35
infamous     -0.35
ingenious    -0.35
DD           -0.34
Positive Logits
Token        Logit
want         1.30
want         1.26
WANT         1.22
Want         1.21
Want         1.16
wants        1.13
wants        1.07
Wants        1.07
WANT         1.02
wanting      0.97
Activations Density: 0.181%
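
The density figure can be read as the share of sampled token positions on which the feature fires. The source does not define the metric, so the sketch below assumes that reading: density is the fraction of positions whose activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and exist only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions on
    which the feature is active. The source does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample, not real data: 181 active positions out of 100,000.
sample = [1.0] * 181 + [0.0] * 99_819
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.181% corresponds to roughly 1.8 active positions per 1,000 sampled tokens.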