INDEX
Explanations
mentions of different types of alcohol, particularly wine
mentions of wine
Negative Logits
Token         Logit
ilitary       -0.79
ulation       -0.76
aneous        -0.73
aneously      -0.71
ospace        -0.71
Occupations   -0.67
ULAR          -0.66
Bangl         -0.65
Tur           -0.64
WATCHED       -0.64
POSITIVE LOGITS
Token         Logit
grapes        1.26
tasting       1.23
cellar        1.23
vinegar       1.23
wine          1.13
wine          1.11
tast          1.04
bottles       1.02
grape         1.02
wines         0.98
Activations Density: 0.033%