Explanations
conflicts and inconsistencies in values and beliefs
Negative Logits
Complaint           -0.15
resembl             -0.15
ÅĻiv                -0.14
ús                  -0.14
EqualityComparer    -0.14
islav               -0.14
VÅ¡                 -0.13
reta                -0.13
ÏĦή                 -0.13
ustin               -0.13
Positive Logits
conflict            0.55
conflicts           0.49
contrary            0.42
Conflict            0.42
conflicting         0.41
clash               0.41
contr               0.40
conflic             0.39
Contr               0.39
CONTR               0.39
Activations Density 0.294%
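
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions on which the feature fires (activation above zero). The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature is active.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 3 of which are active.
toy = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.294% would correspond to the feature firing on roughly 3 of every 1000 sampled tokens.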