INDEX
Explanations
mentions of specific names related to a news event or topic
mentions of specific individuals, particularly those related to a notable incident or topic
Negative Logits
20439     -0.84
ngth      -0.66
carrot    -0.66
UCT       -0.64
oats      -0.64
APS       -0.64
REE       -0.64
itting    -0.64
redits    -0.64
CLASS     -0.64
Positive Logits
Mate      1.08
vich      0.82
rette     0.80
Vie       0.79
eer       0.73
lette     0.73
atoon     0.73
querade   0.72
ovich     0.71
lda       0.71
Activations Density 0.019%