Explanations
mentions of the scale or extent of various situations or phenomena
phrases that indicate magnitude or scale
Negative Logits
Token          Logit
erenn          -0.78
cellaneous     -0.77
ade            -0.75
waukee         -0.74
nob            -0.74
center         -0.72
pretty         -0.72
advertising    -0.71
nis            -0.69
âĨij           -0.69
Positive Logits
Token          Logit
their          0.80
difference     0.80
disparity      0.78
commitment     0.77
bribes         0.77
penalties      0.76
hostilities    0.76
commitments    0.75
these          0.75
penetration    0.75
Activations Density: 0.146%