INDEX
Explanations
arguments concerning morality and ethical contradictions
Negative Logits
oman            -0.18
contradictions  -0.18
plusplus        -0.17
511             -0.15
realities       -0.15
Reality         -0.14
reh             -0.13
perceptions     -0.13
reality         -0.13
ówn             -0.13
Positive Logits
Straw           0.17
ês              0.16
Raw             0.16
Minimal         0.16
ør              0.15
Coord           0.15
ấp              0.15
Minimal         0.15
pace            0.14
defender        0.14
Activations Density: 0.069%