INDEX
Explanations
news headlines, especially those involving strong statements or opinions
instances of the word "MORE" related to various topics or issues
NEGATIVE LOGITS
liest      -0.87
stood      -0.86
bows       -0.85
vet        -0.70
eries      -0.69
ades       -0.68
bred       -0.68
oes        -0.67
keeping    -0.66
bow        -0.65
POSITIVE LOGITS
ado            0.95
than           0.93
VIDEOS         0.88
enhagen        0.79
HEAD           0.78
Than           0.77
INFORMATION    0.75
MORE           0.75
HUD            0.72
importantly    0.72
Activations Density 0.017%