Explanations
references to bullying behaviors
terms related to bullying and its effects
Negative Logits
uncture   -0.79
ateur     -0.78
cession   -0.77
cise      -0.76
Donation  -0.74
icrobial  -0.71
Ec        -0.70
ources    -0.68
ourn      -0.68
ODE       -0.68
Positive Logits
bullies   1.14
bullying  1.04
bullied   1.00
bully     0.93
behav     0.92
ãħĭ       0.84
pul       0.79
girls     0.72
tactics   0.70
ãħĭãħĭ    0.67
Activations Density 0.016%
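
The tokens ãħĭ and ãħĭãħĭ in the positive-logit list look like byte-level BPE renderings of non-ASCII text rather than literal characters. The source does not name the tokenizer, so the following is only a minimal sketch, assuming a GPT-2-style byte-to-unicode table, of how such a token string could be mapped back to the text it stands for. The function names `bytes_to_unicode` and `decode_token` are illustrative, not taken from the dashboard.

```python
def bytes_to_unicode():
    """Build a GPT-2-style table mapping each byte value to a printable character.

    Assumption: the tokenizer behind this dashboard uses this common scheme.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # The remaining bytes are shifted to code points 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed token string back into the UTF-8 text it encodes."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Tokens can split a multi-byte character, so replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãħĭ", "ãħĭãħĭ", "bullies"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ãħĭ corresponds to the bytes E3 85 8B, which is the UTF-8 encoding of the Hangul letter ㅋ (U+314B). Plain ASCII tokens such as bullies pass through unchanged.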