Explanations
Instances of the word "you" used to address the audience directly.
Negative Logits
admitting    -0.66
thinks       -0.63
wors         -0.63
agra         -0.60
considers    -0.60
Roh          -0.59
版           -0.59
ˈ            -0.59
obbies       -0.59
forcement    -0.59
Positive Logits
worth        0.83
MUST         0.65
benefit      0.65
Tube         0.64
vill         0.60
tle          0.59
worldly      0.58
vc           0.58
CAN          0.57
ittal        0.57
Activations Density: 0.092%