Explanations
expressions of gratitude, appreciation, and support
expressions of gratitude and appreciation
Negative Logits
toggle         -0.75
guiActiveUn    -0.62
Worse          -0.61
umably         -0.60
edit           -0.60
udo            -0.57
filler         -0.57
Panic          -0.55
dummy          -0.55
imp            -0.54
Positive Logits
courageous      0.83
wonderful       0.82
sacrific        0.79
compassionate   0.77
multicultural   0.77
excellence      0.76
cellence        0.75
responsibly     0.74
vibrant         0.74
magnificent     0.74
Activations Density: 2.020%