Explanations
expressions related to hope and positivity in difficult situations
Negative Logits
Token    Logit
T        -0.69
<eos>    -0.66
НОЙ      -0.64
u        -0.59
}}],     -0.58
p        -0.58
НЫХ      -0.58
НОСТИ    -0.57
t        -0.57
g        -0.57
Positive Logits
Token         Logit
houſe         0.78
ſtate         0.73
cauſe         0.72
исленность    0.70
ſta           0.69
itſelf        0.67
purpoſe       0.65
ſind          0.65
auffi         0.65
ainfi         0.65
Activations Density: 1.884%