INDEX
Explanations
expressions related to emotions and feelings
Negative Logits
otos         -0.16
anje         -0.16
cai          -0.15
../../../    -0.15
itals        -0.15
figure       -0.15
unos         -0.15
λου          -0.15
unidad       -0.15
esian        -0.14
Positive Logits
sorry        0.21
lessly       0.20
ings         0.19
-good        0.19
thy          0.18
Sorry        0.17
inspace      0.17
sorry        0.17
ledged       0.16
ald          0.16
Activations Density 0.051%
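
To give a sense of scale for the density figure, here is a minimal sketch that assumes "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation. That definition, the `activation_density` helper, and the sample data are all illustrative assumptions and are not taken from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is defined as (active tokens) / (all sampled tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 5 of which activate the feature.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.7, 2.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.050%
```

Under this reading, a density of 0.051% corresponds to roughly 5 active tokens per 10,000 sampled, which makes this a very sparse feature.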