INDEX
Explanations
expressions of appreciation and admiration
Negative Logits
IsContent         -0.88
aggressiveness    -0.67
Pronunciation     -0.64
sév               -0.63
Bronnen           -0.60
elbows            -0.60
unsuccessfully    -0.58
azért             -0.58
ketat             -0.58
interesa          -0.58
Positive Logits
wonderful         1.87
wonderful         1.69
Wonderful         1.69
Wonderful         1.65
marvelous         1.47
lovely            1.38
amazing           1.33
marvellous        1.32
beautiful         1.31
wonderfully       1.26
Activations Density: 0.106%