Explanations
proper nouns and titles
mentions of names related to media organizations or publications
Negative Logits
Token        Logit
omorphic     -0.63
earable      -0.56
Procedure    -0.55
govtrack     -0.55
obin         -0.55
Question     -0.54
HAR          -0.54
isks         -0.51
/(           -0.51
Detect       -0.51
Positive Logits
Token        Logit
favourites   0.85
countless    0.85
few          0.83
Others       0.83
favorites    0.82
innumerable  0.81
others       0.81
Examples     0.80
examples     0.79
many         0.74
Activations Density 0.110%