INDEX
Explanations
phrases indicating desires and aspirations
Negative Logits
Token       Logit
ftagPool    -0.66
jspx        -0.63
ADELPHIA    -0.59
retenir     -0.57
ostrich     -0.57
Ouch        -0.56
Maestro     -0.55
ARGUMENT    -0.55
encomp      -0.54
Exactos     -0.54
Positive Logits
Token       Logit
want        1.03
wanted      0.96
wants       0.93
wanting     0.93
desire      0.89
Wanted      0.85
Want        0.82
WANT        0.82
want        0.79
Want        0.78
Activations Density: 0.377%