INDEX
Explanations
names of political figures or organizations
references to former officials and their positions
Negative Logits
Token          Logit
itivity        -0.69
ickle          -0.69
shapeshifter   -0.68
ptoms          -0.66
ignant         -0.66
ebook          -0.65
minecraft      -0.64
ankind         -0.64
oeuv           -0.62
itars          -0.61
Positive Logits
Token          Logit
£              0.78
ģĸ             0.77
operative      0.75
¢              0.74
stal           0.74
į              0.72
oslov          0.68
ãĤ¶            0.68
§              0.68
Olympia        0.68
Activations Density: 0.167%