INDEX
Explanations
mentions of drinks and cocktails
references to cocktails and mixed drinks
Negative Logits
66666666      -0.80
Parenthood    -0.79
founded       -0.75
debian        -0.74
ledged        -0.74
Merit         -0.68
orate         -0.67
shr           -0.67
thritis       -0.67
Ther          -0.67
Positive Logits
cocktail      1.25
cocktails     1.05
conco         0.99
waitress      0.97
drinks        0.91
drink         0.83
soda          0.83
lounge        0.80
latt          0.80
syrup         0.79
Activations Density 0.012%