Explanations
strong emotional or impactful terms related to various contexts
references to statistical data and its implications
Negative Logits
NetMessage      -0.82
entimes         -0.75
ãĤ¼ãĤ¦ãĤ¹          -0.75
Ă               -0.74
ãĤ¦ãĤ¹            -0.74
ġ               -0.73
RandomRedditor  -0.73
ü               -0.73
÷               -0.73
Ě               -0.73
Positive Logits
afterwards      0.68
ITV             0.63
,               0.61
Anyway          0.61
Fifa            0.56
Spielberg       0.56
though          0.56
because         0.55
later           0.55
Simon           0.55
Activations Density 1.274%