Explanations
words related to mistreatment or mismanagement
Negative Logits
bomb      -0.18
ular      -0.16
gor       -0.16
olia      -0.15
)(((      -0.15
èĸ        -0.15
ieren     -0.15
alaria    -0.14
ôt        -0.14
emu       -0.14
Positive Logits
ellaneous  0.20
ubishi     0.19
ohl        0.17
ustin      0.17
islav      0.16
303        0.15
ouched     0.15
odon       0.15
ocê        0.15
ufen       0.15
Activations Density 0.021%