Explanations
mentions of the Centers for Disease Control and Prevention (CDC) and related health organizations
Negative Logits
Token        Logit
ogra         -0.16
thon         -0.16
.Shapes      -0.15
èĩ           -0.14
IID          -0.14
Ment         -0.14
hton         -0.14
ereo         -0.14
ivation      -0.14
ont          -0.14
Positive Logits
Token        Logit
.gov         0.18
chains       0.15
igner        0.15
rava         0.15
_tokenize    0.14
ognito       0.14
enic         0.14
ovsky        0.14
oyer         0.14
LabelText    0.14
Activations Density: 0.007%