Explanations
References to representatives or politicians, particularly in the context of legislation or political actions.
Negative Logits
hal      -0.15
ernen    -0.15
va       -0.15
erb      -0.15
andro    -0.15
q        -0.14
fixes    -0.14
teas     -0.14
akan     -0.14
amaz     -0.14
Positive Logits
tember   0.16
alink    0.15
UGE      0.15
amedi    0.15
deaux    0.14
nik      0.14
/xhtml   0.14
adele    0.14
èĭı      0.14
ëĭĪìĬ¤   0.14
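
The last two positive-logit tokens (èĭı and ëĭĪìĬ¤) look like raw byte-level BPE display strings rather than readable text. As a minimal sketch, assuming the underlying tokenizer uses the GPT-2 style byte-to-unicode table (the source does not say which tokenizer produced these tokens), they can be mapped back to UTF-8 as follows. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(display: str) -> str:
    """Map a token's display string back to raw bytes, then decode as UTF-8."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(inverse[ch] for ch in display)
    # Tokens may be partial UTF-8 sequences, so replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["tember", "èĭı", "ëĭĪìĬ¤"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, plain ASCII tokens such as `tember` are unchanged, while èĭı and ëĭĪìĬ¤ decode to 苏 and 니스. If the model uses a different tokenizer, the mapping does not apply and the strings should be read as displayed.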
Activations Density: 0.016%