Explanations
statements or actions expressing support for something specific
mentions of support for various causes or initiatives
Negative Logits
edin       -0.85
ashtra     -0.81
orbit      -0.73
NL         -0.71
crore      -0.69
Tracker    -0.69
nick       -0.68
ancest     -0.68
Fle        -0.67
ishi       -0.64
Positive Logits
gotten      0.96
bidden      0.92
gery        0.90
aging       0.86
cing        0.79
geries      0.77
thood       0.73
ints        0.70
purposes    0.68
equality    0.67
Activations Density 0.100%