INDEX
Explanations
evaluative words related to quality and performance
Negative Logits
ategory   -0.85
fty       -0.81
mber      -0.75
aryn      -0.73
ridor     -0.72
gemony    -0.71
eters     -0.70
hyde      -0.69
Strait    -0.68
ipient    -0.67
Positive Logits
enough    1.22
enough    1.16
luck      1.04
bye       0.93
karma     0.88
luck      0.87
Enough    0.85
Samar     0.84
quality   0.82
ol        0.81
Activations Density 0.114%