Explanations
news headlines with urgent calls to action
urgent requests or recommendations for viewing content
Negative Logits
Token      Logit
©¶æ        -0.72
Cantor     -0.71
fortun     -0.70
Schwar     -0.68
Phant      -0.65
rall       -0.65
Mig        -0.64
Bri        -0.64
Lauder     -0.63
Pilgrim    -0.63
Positive Logits
Token      Logit
WATCH      1.24
ARD        0.83
ANG        0.76
ENS        0.72
AUD        0.72
INAL       0.71
IFF        0.70
ached      0.70
LECT       0.69
OSE        0.69
Activation Density: 0.004%