INDEX
Explanations
phrases related to personal experiences and reflections on failure or success
Negative Logits
altet     -0.15
coop      -0.15
zimmer    -0.14
ось       -0.14
собой     -0.13
ennon     -0.13
kir       -0.13
erre      -0.13
mma       -0.13
坊        -0.13
Positive Logits
few       0.58
few       0.51
couple    0.47
vài       0.47
几        0.42
Few       0.42
几个      0.42
Few       0.41
quelques  0.41
birkaç    0.40
Activations Density 0.173%
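The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes "fires" means an activation strictly above zero, and the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 173 of them active.
sample = [1.0] * 173 + [0.0] * 99_827
density = activation_density(sample)
print(f"{density:.3%}")
```

With these made-up counts, the feature is active on 173 of every 100,000 positions, which is the same proportion as the density reported above.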