INDEX
Explanations
instances of the word "like" used as a comparison or to introduce similarities
Negative Logits
堂        -0.17
aps       -0.15
iman      -0.15
obia      -0.15
iah       -0.14
illin     -0.14
óg        -0.14
ropol     -0.14
either    -0.14
odore     -0.14
Positive Logits
many          0.32
many          0.26
any           0.26
许多          0.23
most          0.22
muchos        0.20
MANY          0.20
everything    0.20
WISE          0.19
任何          0.19
Activations Density: 0.056%