INDEX
Explanations
mentions of cafes
Negative Logits
PATH          -0.70
iod           -0.69
sburg         -0.65
methyl        -0.62
RAY           -0.62
ansom         -0.61
kus           -0.60
INGS          -0.58
ologically    -0.58
female        -0.58
POSITIVE LOGITS
cafe          1.16
cafes         1.14
café          1.05
Cafe          0.99
caf           0.97
racer         0.90
eteria        0.89
Café          0.87
kios          0.86
ecake         0.85
Activations Density 0.009%