Explanations
specific instances of café and decorative terms in the text
Negative Logits
uyla    -0.17
ı    -0.14
BST    -0.14
ace    -0.13
iest    -0.13
iano    -0.13
inho    -0.13
ously    -0.13
rell    -0.13
ä¸ĢåĮº    -0.13
Positive Logits
ï¸ı    0.40
ï¸    0.29
elsius    0.23
erif    0.20
âĨĴâĨĴ    0.19
teborg    0.19
ever    0.18
dür    0.18
Ù«    0.18
lico    0.18
Activations Density 0.165%