Explanations
words related to opposition or conflict
references to opposing viewpoints or forces
Negative Logits
onics    -0.74
seed     -0.70
ju       -0.69
Kev      -0.69
oner     -0.69
oled     -0.68
ikan     -0.67
ager     -0.66
del      -0.66
atche    -0.66
Positive Logits
opposing      1.30
opponents     0.96
oppos         0.93
viewpoints    0.90
opponent      0.89
undermin      0.85
opposition    0.83
foes          0.82
sides         0.81
strugg        0.81
Activations Density: 0.014%
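
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading on made-up data; the function name `activation_density`, the zero threshold, and the toy activations are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, 14 of which fire.
toy_activations = [0.0] * 99_986 + [1.5] * 14
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.014%
```

Under this reading, a density of 0.014% means the feature fires on roughly 14 of every 100,000 tokens, which fits a feature tied to a narrow vocabulary of opposition and conflict.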