INDEX
Explanations
emotive language and expressions related to vulnerability and personal experiences
Negative Logits
Token            Logit
للمعارف          -0.66
GenerationType   -0.63
ویکیپدیا         -0.62
surla            -0.56
lizard           -0.55
chuhe            -0.53
onda             -0.52
contigo          -0.52
vere             -0.52
unjang           -0.52
Positive Logits
Token            Logit
impressive       0.80
tably            0.71
glorious         0.70
delightful       0.70
ciless           0.69
rited            0.68
ly               0.68
spirited         0.68
delicious        0.67
oprot            0.67
Activations Density: 0.427%