INDEX
Explanations
the word "like."
the phrase "if you like" in various contexts
Negative Logits
inas       -0.81
arta       -0.76
esi        -0.74
itta       -0.71
chin       -0.71
enthusi    -0.71
cci        -0.70
anthrop    -0.69
isman      -0.69
ospel      -0.67
Positive Logits
lihood     1.43
lier       0.99
liest      0.94
ably       0.94
ours       0.80
liness     0.79
minded     0.78
yours      0.72
liking     0.71
76561      0.69
Activations Density 0.061%