INDEX
Explanations
phrases indicating potential actions, recommendations, or considerations
instances of the phrase "we" and its variants indicating collective thoughts or actions
Negative Logits
amaz      -0.62
Rank      -0.60
dylib     -0.58
bats      -0.58
rams      -0.56
shows     -0.54
Cheong    -0.54
satell    -0.54
WTC       -0.54
Ups       -0.53
Positive Logits
sorely         1.13
dearly         0.90
aspire         0.87
gladly         0.86
dreamed        0.85
envy           0.81
wont           0.79
desperately    0.79
vehemently     0.78
hotly          0.78
Activations Density: 0.155%