Explanations
proper nouns
names and references to individuals, particularly in a statistical or contextual frame
Negative Logits
Token        Logit
pection      -0.79
λ            -0.68
NetMessage   -0.68
MAR          -0.67
owship       -0.67
nir          -0.67
MET          -0.67
hesda        -0.67
hiba         -0.65
ancial       -0.64
Positive Logits
Token        Logit
anski        0.80
opa          0.72
haus         0.72
phe          0.67
illions      0.65
jug          0.61
Balk         0.60
ucl          0.60
atican       0.60
oli          0.58
Activations Density: 0.247%