Explanations
questions and reflections about societal expectations and personal circumstances, particularly related to wealth and self-worth
Negative Logits
ーリ              -0.14
agoon           -0.14
iž              -0.14
大量              -0.14
fcn             -0.13
habi            -0.13
ories           -0.13
Longrightarrow  -0.13
rating          -0.13
ANNER           -0.13
Positive Logits
even            0.48
even            0.41
Even            0.35
Even            0.34
sogar           0.34
甚至              0.33
EVEN            0.33
даже            0.33
barely          0.31
thậm            0.29
Activations Density 0.290%