INDEX
Explanations
phrases related to critiques or negative commentary
references to quantity or amounts
Negative Logits
ĸļ        -0.84
agre      -0.82
condem    -0.70
dayName   -0.69
oler      -0.68
depot     -0.67
disposed  -0.65
acus      -0.64
posal     -0.63
ende      -0.62
Positive Logits
arching   0.83
things    0.64
us        0.63
icial     0.63
legged    0.62
unlucky   0.62
tan       0.61
whom      0.61
lesser    0.58
outed     0.57
Activations Density 0.104%