Explanations
symbols or formatting elements
instances of symbols or representations of political or social movements
Negative Logits
Manz        -0.64
dod         -0.64
Jeanne      -0.64
Izan        -0.64
Bris        -0.62
shroud      -0.61
sters       -0.61
Shelter     -0.61
ABE         -0.61
sacrific    -0.61
Positive Logits
ª           1.45
Ĵ           1.39
IJ           1.33
«           1.24
¹           1.20
ı           1.20
ĸ           1.18
³           1.17
ij           1.17
Ķ           1.17
Activations Density 0.090%