INDEX
Explanations
attributes associated with ethical and moral values
commas in lists of ideas or concepts
Negative Logits
riber     -0.71
cients    -0.71
cko       -0.70
acent     -0.69
quet      -0.69
upiter    -0.68
utor      -0.68
ported    -0.67
geon      -0.67
assed     -0.67
Positive Logits
creativity     1.03
albeit         0.99
heroism        0.94
humility       0.92
degradation    0.92
alienation     0.89
intolerance    0.88
tolerance      0.87
paranoia       0.86
bravery        0.85
Activations Density 0.355%