INDEX
Explanations
names related to politics and healthcare
references to specific individuals, particularly politicians
Negative Logits
Token       Logit
rained      -0.75
————————    -0.72
rex         -0.72
Meta        -0.70
mast        -0.70
company     -0.67
اÙĦ         -0.66
ĸļ          -0.66
oleon       -0.64
emet        -0.62
Positive Logits
Token       Logit
veyard      0.84
ayson       0.84
Grimes      0.82
iffin       0.81
banks       0.79
saf         0.77
atures      0.75
roots       0.74
veland      0.74
bly         0.73
Activations Density 0.018%