INDEX
Explanations
mentions of drinking water
mentions of drinking water or alcoholic beverages
Negative Logits
eq            -0.77
orp           -0.74
debian        -0.70
elin          -0.69
roe           -0.67
Kinnikuman    -0.67
moving        -0.66
sure          -0.65
sha           -0.65
Egypt         -0.64
Positive Logits
drinking      1.11
cohol         1.07
alcohol       1.02
beverage      1.00
beverages     0.99
drinkers      0.98
drink         0.95
drinks        0.93
Drink         0.90
whiskey       0.89
Activations Density 0.010%