INDEX
Explanations
phrases related to disciplinary actions
terms related to discipline and its associated consequences
Negative Logits
izen      -0.76
ileaks    -0.76
          -0.72
ipeg      -0.69
stanbul   -0.69
ophe      -0.68
sworth    -0.68
endor     -0.68
YP        -0.68
ocent     -0.67
Positive Logits
discipline     0.87
discipl        0.81
disciplinary   0.79
disciplined    0.78
srfAttach      0.74
tant           0.73
estinal        0.72
disciplines    0.70
Malfoy         0.70
iple           0.70
Activations Density: 0.050%