INDEX
Explanations
an indication of preference or approval
instances of the phrase "if you like."
Negative Logits
inas       -0.80
arta       -0.75
bourne     -0.75
esi        -0.73
ells       -0.72
Dispatch   -0.70
doi        -0.70
anthrop    -0.69
inion      -0.69
chin       -0.68
Positive Logits
lihood     1.43
lier       1.01
liest      1.00
ably       0.88
liness     0.84
minded     0.80
ours       0.76
glers      0.75
76561      0.69
terday     0.67
Activations Density 0.066%