Explanations
expressions related to recognition and appreciation
expressions of personal achievement and recognition
Negative Logits
pmwiki          -0.76
cumbers         -0.69
curs            -0.68
Alternatively   -0.67
Worse           -0.67
ensibly         -0.65
orno            -0.65
Inquiry         -0.64
warning         -0.63
predec          -0.62
Positive Logits
."              1.14
.""             1.02
,"              1.02
.,"             1.00
.''             0.99
!"              0.94
,''             0.93
".              0.85
.",             0.84
!!"             0.84
Activations Density: 0.336%