Explanations
names of people and organizations
Negative Logits
LookAnd        -0.86
autorytatywna  -0.74
kasarigan      -0.67
الاطلاع        -0.65
rungsseite     -0.65
beginnetje     -0.64
fjspx          -0.63
оригіналу      -0.62
ujednoznacz    -0.62
препратки      -0.61
Positive Logits
leaſt     0.67
houſe     0.64
pleaſure  0.63
neſs      0.63
ftances   0.61
ſaid      0.58
uſed      0.56
ſelf      0.56
ſmall     0.55
ſhould    0.55
Activations Density 3.883%