INDEX
Explanations
pronouns and related phrases
pronouns and their usage in sentences
Negative Logits
Token        Logit
Reviewer     -0.62
Flavoring    -0.60
recent       -0.60
%%%%         -0.55
Have         -0.54
paren        -0.53
umph         -0.53
hiatus       -0.53
UGH          -0.52
Recently     -0.52
Positive Logits
Token        Logit
depended     1.52
mattered     1.51
tended       1.44
interacted   1.33
resembled    1.30
flowed       1.30
risked       1.27
differed     1.27
wore         1.24
interfered   1.23
Activations Density: 0.661%