Explanations
words related to negative or extreme adjectives
prefixes and terms related to negative or problematic qualities
Negative Logits
ulhu      -0.72
Reviewer  -0.68
Angels    -0.63
Blacks    -0.63
stones    -0.59
tops      -0.58
ecstasy   -0.58
Jackets   -0.58
ashes     -0.57
Balls     -0.56
Positive Logits
cedented  0.99
achable   0.81
ivable    0.81
itable    0.79
iable     0.78
izable    0.78
isable    0.77
vised     0.75
ended     0.75
icable    0.73
Activations Density 0.101%