INDEX
Explanations
people's names from interviews or news reports
proper nouns, particularly names and titles
Negative Logits
edition      -0.57
stro         -0.56
Introduced   -0.56
due          -0.56
ushima       -0.55
astered      -0.54
Torrent      -0.53
orest        -0.53
ords         -0.52
earthqu      -0.52
Positive Logits
whether      1.48
why          1.42
how          1.27
what         1.19
why          1.14
whether      1.13
if           1.12
WHY          1.05
what         1.04
questions    1.00
Activations Density: 0.118%