INDEX
Explanations
mentions of criminal or morally reprehensible activities
words and phrases related to legal and social violations
Negative Logits
UNCLASSIFIED    -0.85
anwhile         -0.80
pse             -0.72
)."             -0.69
Azerb           -0.69
.).             -0.69
é¾įå            -0.66
âķIJ             -0.65
âĹ¼             -0.62
thereafter      -0.61
Positive Logits
Belfast         0.81
âĢº             0.74
âĢİ             0.62
âĢİ             0.59
Copyright       0.56
Posted          0.56
Skip            0.54
cryptocurrency  0.53
Donald          0.53
lately          0.52
Activations Density 1.853%