INDEX
Explanations
mentions of political figures and their affiliations
mentions of California-related terms
Negative Logits
clock       -0.65
lift        -0.65
culosis     -0.63
BOOK        -0.63
nings       -0.62
ty          -0.62
host        -0.61
VK          -0.59
Grail       -0.58
balloons    -0.58
Positive Logits
Calif       1.23
aii         0.87
ignt        0.85
eno         0.84
qua         0.82
Calif       0.79
sylvania    0.78
uti         0.76
leans       0.75
osi         0.74
Activations Density: 0.004%