Explanations
instances of the word "like" and its variations
Negative Logits
Token        Logit
ights        -0.15
rib          -0.15
Æ°á»Ľi       -0.15
like         -0.15
_COMPAT      -0.14
ÙĬج         -0.14
_DECLARE     -0.14
dy           -0.14
bart         -0.14
icamente     -0.14
Positive Logits
Token        Logit
-minded      0.35
minded       0.33
WISE         0.28
unto         0.25
ewise        0.22
hood         0.22
able         0.20
-wise        0.20
Minds        0.19
elihood      0.18
Activations Density: 0.034%
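As a rough illustration of what a density figure like this can mean, the sketch below treats it as the fraction of token positions on which the feature's activation exceeds a threshold. That definition is an assumption, since the source does not state how the value is computed. The function name `activation_density` and the sample data are made up for the example.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation is strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    total = 0
    active = 0
    for value in activations:
        total += 1
        if value > threshold:
            active += 1
    if total == 0:
        raise ValueError("activations must not be empty")
    return active / total


# Synthetic data: 34 active positions out of 100,000 tokens.
sample = [0.0] * 99_966 + [1.2] * 34
print(f"{activation_density(sample):.3%}")
```

A feature this sparse fires on roughly one token in three thousand, which fits a narrow lexical pattern such as a single word and its variants.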