Explanations
wishes or desires expressed in the text
expressions of desire or hope for better outcomes
Negative Logits
Token           Logit
etitive         -0.71
lishes          -0.69
Solitaire       -0.68
advertisement   -0.60
competitive     -0.60
ulators         -0.59
ulated          -0.59
ulating         -0.58
Ranked          -0.58
ç«              -0.58
Positive Logits
Token           Logit
ful             0.93
bone            0.90
bones           0.84
mares           0.83
fully           0.79
FUL             0.75
erey            0.74
lists           0.73
aloud           0.72
llor            0.71
Activations Density 0.018%