INDEX
Explanations
names or terms related to organizations or events
references to specific names or abbreviations, likely related to organizations or entities
Negative Logits
*/(          -0.80
worthiness   -0.76
theless      -0.70
bread        -0.66
tons         -0.66
stage        -0.64
sburg        -0.64
lain         -0.64
ski          -0.62
swer         -0.61
Positive Logits
ñ            1.16
ordan        1.09
agra         1.09
ña           1.08
abetes       1.08
pper         1.07
ablo         1.06
enna         1.04
orno         1.02
1.02
Activations Density 0.075%