Explanations
the word "but" in various contexts
Negative Logits
Token       Logit
ehr         -0.17
ipop        -0.16
dee         -0.15
chứ         -0.15
Guerrero    -0.14
上が        -0.14
osc         -0.14
iteur       -0.14
ても        -0.14
assy        -0.13
Positive Logits
Token         Logit
nice          0.16
スカ          0.15
nice          0.14
apparently    0.14
basically     0.14
briefly       0.14
maybe         0.14
Suff          0.14
nic           0.13
711           0.13
Activations Density: 0.133%