INDEX
Explanations
references to various types of dining establishments and restaurants
Negative Logits
anders    -0.16
»         -0.16
lez       -0.15
ään       -0.14
ÛĮ        -0.14
sik       -0.14
rey       -0.14
ÏĦη       -0.14
unas      -0.14
erer      -0.14
Positive Logits
/bar      0.32
chains    0.31
chain     0.30
/pub      0.29
-bars     0.28
-bar      0.26
Chains    0.25
/fast     0.24
-chain    0.24
_chain    0.24
Activations Density 0.022%