INDEX
Explanations
words related to conflict or opposition, typically involving adversarial relationships
Negative Logits
sospe             -0.44
mengangg          -0.41
cámara            -0.38
setVerticalGroup  -0.38
agres             -0.36
recours           -0.35
racc              -0.35
fidélité          -0.34
coupable          -0.34
artificiales      -0.34
Positive Logits
defeat       0.81
vanqu        0.77
annihil      0.74
decim        0.73
incapac      0.69
overthrow    0.69
dismantling  0.69
dismantle    0.69
defeating    0.68
annihilated  0.68
Activations Density 0.534%