INDEX
Explanations
significant events throughout history
references to the concept of history
Negative Logits
Token      Logit
*/(        -0.84
nets       -0.79
jab        -0.77
pload      -0.75
reau       -0.70
utan       -0.70
hap        -0.69
liga       -0.68
atum       -0.67
forward    -0.65
Positive Logits
Token      Logit
mankind    0.77
warfare    0.74
Warfare    0.71
sorts      0.70
course     0.70
Humanity   0.68
humankind  0.67
heroism    0.67
PROG       0.67
Belief     0.65
Activations Density: 0.163%