Explanations
references to military presence and operations
Negative Logits
Johns      -0.15
تÙĦ        -0.14
Owens      -0.14
æ©         -0.14
олиÑĤ      -0.14
fü         -0.14
abbit      -0.13
Flem       -0.13
-buffer    -0.13
aza        -0.13
Positive Logits
Tu         0.23
Pants      0.20
Mi         0.20
Tu         0.20
Tup        0.19
Su         0.17
Mi         0.17
Cosmos     0.16
Suk        0.16
ãĤĤãģ£ãģ¨  0.16
Activations Density 0.021%