INDEX
Explanations
content related to international politics and global conflicts
Negative Logits
lies                              -0.74
terday                            -0.72
™:                                -0.70
Awakens                           -0.70
Relief                            -0.68
Bars                              -0.68
rones                             -0.67
izable                            -0.67
Needs                             -0.64
rawdownloadcloneembedreportprint  -0.63
Positive Logits
criticized   1.08
likened      1.08
avering      1.01
criticised   0.99
praised      0.97
subjected    0.95
hailed       0.94
touted       0.93
plagued      0.93
extensively  0.92
Activations Density 0.149%