INDEX
Explanations
adjectives signaling high status or significance
Negative Logits
Token        Logit
auld         -0.77
tremend      -0.73
afforded     -0.71
accomp       -0.70
SPONSORED    -0.70
helm         -0.70
enough       -0.70
leans        -0.69
etsy         -0.68
feature      -0.67
Positive Logits
Token        Logit
Nit          0.79
Cutter       0.78
Cyr          0.78
Stella       0.77
Opp          0.77
PHI          0.76
Ort          0.76
Mit          0.76
Ply          0.76
STATS        0.75
Activations Density: 0.187%