Explanations
contradictory statements and the concepts of necessity and value in societal contexts
Negative Logits
Heard      -0.17
arrings    -0.17
Pix        -0.16
upe        -0.16
rema       -0.15
udad       -0.15
ague       -0.15
FML        -0.14
βά         -0.14
heard      -0.14
Positive Logits
rel        0.75
Rel        0.41
sav        0.37
(rel       0.37
REL        0.37
-rel       0.35
Rel        0.35
rel        0.35
.rel       0.32
revel      0.31
Activations Density 0.282%