INDEX
Explanations
mentions of cafes or coffee shops
Negative Logits
nat          -0.17
nan          -0.16
hyp          -0.15
Guerrero     -0.15
usted        -0.14
trav         -0.14
ob           -0.14
lashes       -0.14
istrovství   -0.13
advanced     -0.13
Positive Logits
anova        0.17
arian        0.16
presso       0.16
ị            0.15
-thumb       0.14
Highlander   0.14
itmap        0.14
ivate        0.14
ritte        0.14
Morm         0.14
Activations Density 0.007%