INDEX
Explanations
phrases related to legal complaints and documents
negative statements or phrases that express a lack of validation or certainty
Negative Logits
knows            -0.66
keeps            -0.64
waits            -0.64
likes            -0.63
hopping          -0.63
loves            -0.62
dying            -0.62
TPPStreamerBot   -0.60
invincible       -0.60
sleeps           -0.59
Positive Logits
involve          1.66
relate           1.34
entail           1.30
originate        1.29
contain          1.25
consist          1.24
hinge            1.23
encompass        1.23
reflect          1.20
stem             1.17
Activations Density: 0.368%