Explanations
sentences that express feelings of happiness and positivity
Negative Logits
nephe       -0.75
hierogly    -0.75
hyal        -0.72
erythro     -0.72
Picchu      -0.71
causation   -0.70
ükemmel     -0.70
ediakan     -0.70
ulongan     -0.69
sahaja      -0.69
Positive Logits
kont        0.51
kel         0.44
bes         0.44
dan         0.43
lont        0.43
ber         0.43
di          0.42
ke          0.41
dengan      0.39
meng        0.39
Activations Density 0.180%