Explanations
actions or events emphasizing significant outcomes or consequences
references to social and political issues
Negative Logits
.",      -0.72
?",      -0.70
.?       -0.65
estine   -0.59
)",      -0.59
orsi     -0.58
.....    -0.57
oln      -0.56
ena      -0.56
eneg     -0.55
Positive Logits
ãĥ³ãĤ¸                   0.66
ãĥĩãĤ£                   0.66
BuyableInstoreAndOnline  0.64
ãĥ¥                      0.63
ãĥīãĥ©                   0.56
æ©                       0.54
ãĤ·ãĥ£                   0.53
ãĥł                      0.52
uously                   0.51
ãĥIJ                      0.51
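Several of the positive-logit tokens above (for example ãĥ³ãĤ¸) look like raw vocabulary strings from a byte-level BPE tokenizer rather than readable text. The page does not name the tokenizer, so the following is only a minimal sketch that assumes the GPT-2 style byte-to-unicode mapping. The helper names bytes_to_unicode and decode_token are illustrative and do not come from the page.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte (0-255) to a printable character.

    Assumption: the tokens on this page use this mapping; the page does not say so.
    """
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a vocabulary string back to bytes, then decode those bytes as UTF-8.

    Tokens that end in the middle of a multi-byte character cannot be fully
    decoded; the incomplete part becomes U+FFFD (the replacement character).
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ³ãĤ¸", "ãĥĩãĤ£", "ãĥīãĥ©", "æ©", "uously"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãĥ³ãĤ¸ decodes to ンジ and ãĥĩãĤ£ to ディ, which would make most of the top positive tokens katakana fragments. A token such as æ© covers only part of a multi-byte character, so it has no complete decoding on its own.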
Activations Density 0.922%