Explanations
provocative or controversial statements
Negative Logits
circulating   -0.75
flooded       -0.72
diversion     -0.71
auxiliary     -0.68
prescribed    -0.67
Haj           -0.66
Maxwell       -0.65
volunt        -0.65
floating      -0.65
hovering      -0.65
Positive Logits
Br       3.75
br       1.73
BR       1.64
Br       1.62
Brus     1.54
Fr       1.25
Bel      1.25
Brand    1.25
Buff     1.20
Bron     1.19
Activations Density 0.010%