Explanations
exclamations and positive affirmations
expressions of positivity or approval
Negative Logits
iaz          -0.73
guiActiveUn  -0.69
ascus        -0.68
ural         -0.67
urities      -0.65
IRE          -0.65
gypt         -0.65
uterte       -0.65
rose         -0.65
inez         -0.64
Positive Logits
albeit           0.94
albeit           0.81
congratulations  0.79
huh              0.75
enough           0.71
insofar          0.69
but              0.69
Especially       0.69
BUT              0.69
âĹ¼              0.67
Activations Density 0.556%