INDEX
Explanations
phrases related to activities or tasks being completed
negative indicators or descriptions related to individuals or situations
Negative Logits
Dickinson    -0.73
ulhu         -0.68
FY           -0.66
incub        -0.64
corros       -0.63
floppy       -0.62
modem        -0.62
undergrad    -0.61
repl         -0.60
improv       -0.60
Positive Logits
san          1.27
sama         1.27
based        1.24
style        1.11
Lago         1.10
themed       1.07
derived      1.06
winning      1.06
kun          1.06
sized        1.05
Activations Density: 0.138%