Explanations
words related to contradiction or opposing ideas
Negative Logits
Token          Logit
usercontent    -0.17
Див            -0.16
mun            -0.15
chua           -0.15
몬             -0.14
Div            -0.14
rosis          -0.14
ULONG          -0.14
řen            -0.14
ulong          -0.14
Positive Logits
Token          Logit
contr          0.22
CONTR          0.20
Contr          0.17
ictory         0.17
ax             0.16
aven           0.16
ional          0.16
ived           0.15
Contr          0.15
aires          0.15
Activations Density: 0.017%