INDEX
Explanations
references to political ideologies and affiliations
Negative Logits
available    -0.83
}}}          -0.68
OUNT         -0.63
++++         -0.62
inev         -0.61
oise         -0.61
ound         -0.60
ulkan        -0.60
ector        -0.59
icon         -0.59
Positive Logits
ded          0.71
alties       0.67
consulting   0.65
sqor         0.61
Sons         0.59
Marriott     0.57
ufact        0.57
bol          0.57
Winter       0.57
eger         0.56
Activations Density 0.168%