INDEX
Explanations
news headlines that include additional information
instances of the word "More" indicating ongoing or additional information
Negative Logits
Token           Logit
uca             -0.78
atan            -0.71
keeping         -0.70
ovie            -0.69
xtap            -0.69
Fram            -0.67
idated          -0.67
DragonMagazine  -0.66
ãĥ¼ãĥĨ          -0.66
OCK             -0.64
Positive Logits
Token           Logit
importantly     1.24
than            1.08
ado             1.07
Than            1.07
important       0.79
Likely          0.78
interesting     0.76
sophisticated   0.76
troubling       0.73
worrisome       0.73
Activations Density 0.067%