Explanations
mentions of specific names or topics in a given context
instances of the word "mention" and its variations
Negative Logits
Token           Logit
quer            -0.86
sis             -0.79
emort           -0.74
kie             -0.73
¯¯              -0.70
hematic         -0.70
zers            -0.70
Ñı              -0.68
¯¯¯¯¯¯¯¯        -0.68
?????-?????-    -0.66
Positive Logits
Token           Logit
mentioning      1.05
mentions        0.93
lihood          0.90
mention         0.90
mentioned       0.77
names           0.74
prominently     0.73
aloud           0.71
therein         0.67
Kislyak         0.65
Activations Density: 0.055%