INDEX
Explanations
adjectives or adverbs describing actions or qualities in a negative context
instances of "large" creatures in descriptions, particularly related to Pokémon
Negative Logits
Token          Logit
anders         -0.70
ategory        -0.67
conservancy    -0.61
anners         -0.60
ulhu           -0.59
annis          -0.59
illon          -0.59
REL            -0.57
aterial        -0.56
mosqu          -0.56
Positive Logits
Token          Logit
ly             3.50
LY             2.36
fully          1.57
lys            1.57
lies           1.49
edly           1.48
liness         1.46
ELY            1.41
ously          1.36
lly            1.36
Activations Density: 0.408%