INDEX
Explanations
words related to indulgence and self-gratification
Negative Logits
eus       -0.17
Gro       -0.16
eks       -0.16
eko       -0.16
ÙĪØ¹    -0.15
رÙĪØ´    -0.15
zza       -0.14
erness    -0.14
%M        -0.14
quat      -0.14
Positive Logits
ging      0.60
ged       0.58
ges       0.54
ger       0.46
gers      0.44
ge        0.40
gem       0.39
GING      0.36
g         0.35
GE        0.35
Activations Density 0.022%