Explanations
references to historical genocides and related events
Negative Logits
gradable    -0.15
iae         -0.15
Sever       -0.14
Grimm       -0.14
찰          -0.14
istrat      -0.14
acia        -0.14
[::-        -0.14
Gra         -0.13
UES         -0.13
Positive Logits
gen         0.31
genocide    0.29
ocide       0.28
/gen        0.26
Gen         0.25
gén         0.24
mass        0.23
-gen        0.21
Gen         0.21
exter       0.20
Activations Density: 0.167%