Explanations
phrases indicating positive evaluations or quality assessments
Negative Logits
Token           Logit
Schaefer        -0.88
onItemClick     -0.81
whoſe           -0.80
Nugent          -0.80
bootstrapcdn    -0.75
ᾶ               -0.74
ობ              -0.74
DiCaprio        -0.74
оригіналу       -0.73
jstor           -0.72
Positive Logits
Token           Logit
well            1.37
WELL            1.37
Well            1.32
well            1.27
Well            1.26
Wells           1.25
wells           1.20
Wells           1.19
Welles          1.15
WELLS           1.09
Activations Density: 0.104%
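
For readers unfamiliar with the density figure, it is usually read as the fraction of sampled token positions on which the feature fires. The sketch below is only an illustration of that arithmetic, not the dashboard's own code: the function name `activation_density`, the zero threshold, and the sample counts (104 active positions out of 100,000) are all assumptions chosen so the result matches the value shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of values in `activations` above `threshold`.

    Hypothetical helper: the zero threshold is an assumption, not a
    documented setting of the tool that produced this dashboard.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 100,000 token positions, 104 of which are active.
sample = [0.0] * 99_896 + [1.0] * 104
density = activation_density(sample)
print(f"{density:.3%}")
```

Under these assumed counts, a density of 0.104% corresponds to roughly one active token position per thousand.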