INDEX
Explanations
proper nouns or names used for comparisons
phrases involving the word "likes" and its context
Negative Logits
Token        Logit
INTON        -0.63
ISH          -0.62
士           -0.62
Procedures   -0.61
Donation     -0.59
Sources      -0.59
sources      -0.56
cas          -0.56
NING         -0.55
Springs      -0.55

Positive Logits
Token        Logit
liest        1.14
lihood       1.06
hots         0.81
liness       0.78
creen        0.77
lier         0.77
wikipedia    0.74
paces        0.72
of           0.70
minded       0.69
Activations Density: 0.027%