Explanations
names of individuals or places
Negative Logits
Token                      Logit
DragonMagazine             -0.75
censored                   -0.68
Xin                        -0.66
Tibetan                    -0.61
Korra                      -0.61
heroin                     -0.61
futures                    -0.60
responsible                -0.59
BuyableInstoreAndOnline    -0.59
ENSE                       -0.59
Positive Logits
Token                      Logit
izophren                   1.31
izoph                      1.31
utz                        1.24
umann                      1.20
uler                       1.17
mitt                       1.14
acht                       1.12
mid                        1.12
afer                       1.10
ulz                        1.09
Activations Density: 0.021%