Explanations
phrases related to personal grievances and legal disputes
Negative Logits
Token        Logit
опÑĢед       -0.17
wil          -0.15
lez          -0.15
urovision    -0.14
Recovered    -0.14
teÅŁ         -0.14
ngth         -0.13
isse         -0.13
unday        -0.13
Äı           -0.13
Positive Logits
Token        Logit
innoc        0.20
innocence    0.20
innocent     0.20
Innoc        0.17
nowhere      0.16
deserve      0.16
na           0.15
harmless     0.15
trusted      0.15
iec          0.15
Activations Density: 0.267%