INDEX
Explanations
phrases related to expressing strong opinions or directives
Negative Logits
uthor              -0.83
OOL                -0.74
ESE                -0.70
DAQ                -0.70
natureconservancy  -0.69
ctic               -0.67
soever             -0.66
Seym               -0.65
HER                -0.64
ÄŁ                 -0.64
Positive Logits
blank              0.94
iasis              0.89
point              0.86
Reyes              0.85
blank              0.84
lessly             0.83
points             0.81
deduction          0.80
pointing           0.78
entin              0.76
Activations Density: 4.807%