INDEX
Explanations
references to tragic events or individuals
references to victims and historical narratives surrounding them
Negative Logits
Leilan        -0.64
flows         -0.63
constraints   -0.62
stamp         -0.60
Advent        -0.59
Panther       -0.58
Hath          -0.57
waivers       -0.57
backer        -0.56
PRE           -0.56
Positive Logits
orian         2.03
ory           1.86
orians        1.85
orical        1.80
oria          1.67
orious        1.67
orically      1.64
oire          1.63
ories         1.63
oric          1.63
Activations Density: 0.067%