INDEX
Explanations
espionage and secret information
NEGATIVE LOGITS
↵         1.37
’         1.21
ovvero    1.20
ism       1.08
r         1.08
hubo      1.07
K         1.05
werd      1.03
’;        1.02
s         1.00
POSITIVE LOGITS
اب        1.27
ИС        1.16
десят     1.14
НИЕ       1.10
ERA       1.09
pras      1.08
HING      1.08
په        1.08
КС        1.05
Thriller  1.05
Activations Density: 0.359%