INDEX
Explanations
terms related to bullying
references to bullying and its related terms
Negative Logits
Num        -0.76
Num        -0.74
lime       -0.72
ources     -0.70
apache     -0.70
Lumin      -0.69
Recon      -0.69
Hal        -0.68
Interior   -0.67
Illum      -0.67
Positive Logits
bullying     3.31
bullies      3.12
bully        3.04
bullied      2.94
bull         1.52
blackmail    1.40
classmate    1.35
Bull         1.25
homophobic   1.24
trolls       1.24
Activations Density: 0.023%