INDEX
Explanations
mentions of political affiliations and actions in the context of Senate proceedings
Negative Logits
Sloan       -0.15
imbalance   -0.15
ooks        -0.14
atican      -0.14
lassen      -0.14
-pos        -0.14
rush        -0.14
aign        -0.13
ibraltar    -0.13
łģ          -0.13
Positive Logits
modifiable   0.14
@student     0.14
ิบ           0.14
istrovství   0.14
بس           0.14
jection      0.14
tridges      0.14
прим         0.14
-Sah         0.13
حل           0.13
Activations Density 0.023%