INDEX
Explanations
user mentions in social media or online platforms
references to blocking or unlocking digital content or accounts
Negative Logits
glim        -0.75
toget       -0.71
handc       -0.70
tti         -0.67
quarters    -0.65
Dh          -0.63
liner       -0.61
nutshell    -0.60
ppe         -0.60
Frie        -0.59
Positive Logits
able        1.49
ables       1.42
ABLE        1.22
ability     1.21
ment        1.12
abilities   1.00
ers         0.96
edIn        0.95
ible        0.94
er          0.93
Activations Density: 0.111%