Explanations
instances where something is mentioned, typically to emphasize importance or highlight information
instances of the word "mention" in a variety of contexts
Negative Logits
sett          -0.77
¯¯¯¯¯¯¯¯      -0.74
sled          -0.72
uilt          -0.69
orneys        -0.68
Drag          -0.68
waged         -0.66
gamer         -0.66
millenn       -0.66
¯¯            -0.65
Positive Logits
mentions       0.97
mentioning     0.96
mention        0.92
lihood         0.88
fulness        0.84
ibility        0.77
cliffe         0.77
aloud          0.76
lessness       0.76
lessly         0.75
Activations Density 0.020%