Explanations
mentions of politicians from California
Negative Logits
Token      Logit
VK         -0.70
ty         -0.66
Grail      -0.66
CMS        -0.65
clock      -0.60
lift       -0.60
Drawn      -0.59
HIP        -0.58
ebin       -0.58
Hussain    -0.58
Positive Logits
Token      Logit
Calif      1.24
ignt       0.92
sylvania   0.87
utics      0.80
luent      0.79
ortium     0.79
aii        0.77
Calif      0.75
Californ   0.74
qua        0.72
Activations Density: 0.007%