Explanations
The neuron activates on mentions of cafés (cafe/café) and related coffee-shop terms.
Negative Logits
Token        Logit
sağlayan     -0.07
bindings     -0.07
rooted       -0.07
鸟           -0.06
výkon        -0.06
Loop         -0.06
Hook         -0.06
Rhino        -0.06
groin        -0.06
Strong       -0.06
Positive Logits
Token          Logit
Café           0.14
Cafe           0.13
cafe           0.12
café           0.12
cafes          0.09
Caf            0.09
..........     0.07
Kaf            0.07
afe            0.07
refrigerator   0.07
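The page does not say how these logit values are computed. One common way to produce such lists, assumed here purely for illustration, is to take the dot product of the neuron's output direction with each token's unembedding vector and then rank the tokens. The sketch below uses made-up two-dimensional vectors; the names `logit_effects`, `top_k`, `direction`, and `unembed` are hypothetical and do not come from the source.

```python
# Minimal sketch, assuming logit effect = dot(neuron output direction, token unembedding row).
# All vectors below are toy values chosen for illustration, not data from the dashboard.

def logit_effects(direction, unembed):
    """Return {token: dot(direction, unembedding_row)} for every token."""
    return {
        token: sum(d * u for d, u in zip(direction, row))
        for token, row in unembed.items()
    }

def top_k(effects, k):
    """Return (k most positive, k most negative) as lists of (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    most_positive = ranked[:k]
    most_negative = ranked[-k:][::-1]  # most negative first
    return most_positive, most_negative

# Hypothetical neuron output direction and unembedding rows.
direction = [1.0, 0.0]
unembed = {
    "cafe": [0.5, 0.1],
    "tea": [0.125, 0.0],
    "loop": [-0.25, 0.3],
}

effects = logit_effects(direction, unembed)
positive, negative = top_k(effects, k=1)
```

Under that assumption, tokens whose unembedding vectors align with the neuron's output direction land in the positive list, and tokens pointing the opposite way land in the negative list.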
Activations Density: 0.006%
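The page does not define activation density. Assuming it means the percentage of tokens on which the neuron has a nonzero activation, 0.006% corresponds to roughly 6 active tokens per 100,000. The sketch below shows that arithmetic on a synthetic sample; the function name `activation_density` and the `threshold` parameter are hypothetical.

```python
# Minimal sketch, assuming density = percentage of tokens with activation above a threshold.
# The sample below is synthetic and only reproduces the 0.006% figure for illustration.

def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly greater than `threshold`."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)

# 6 active tokens out of 100,000.
sample = [0.0] * 99_994 + [1.5] * 6
density = activation_density(sample)
```

A density this low suggests the neuron fires on a very narrow set of inputs, which fits the café-specific explanation given above.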