INDEX
Explanations
personal pronouns and verbs indicating actions or intent
Negative Logits
Consortium    -0.77
Eleven        -0.64
Binding       -0.64
Globe         -0.61
Amazing       -0.60
Standard      -0.60
Rousse        -0.59
Associated    -0.59
Anat          -0.59
Wide          -0.58
Positive Logits
've           1.51
're           1.40
'd            1.34
'll           1.25
'm            1.09
knew          1.09
could         1.05
encount       1.05
can           1.03
forgot        0.99
Activations Density: 1.772%