INDEX
Explanations
arguments or discussions related to political agendas and public policy
Negative Logits
Token       Logit
Minor       -0.62
cknow       -0.62
extent      -0.59
Wish        -0.55
pport       -0.54
upkeep      -0.53
ALSE        -0.52
Seat        -0.52
Reasons     -0.52
etheless    -0.51
Positive Logits
Token       Logit
onto        1.82
into        1.68
into        1.38
Into        1.24
INTO        1.19
toward      1.12
overboard   1.03
towards     1.02
onto        0.96
away        0.94
Activations Density: 0.702%