INDEX
Explanations
phrases related to drinking and containers such as cups and jars
words and phrases related to drinking and consumption
Negative Logits
invari        -0.70
architecture  -0.70
canonical     -0.69
arche         -0.68
backward      -0.66
incumb        -0.66
symmetry      -0.65
ĺħ            -0.65
graceful      -0.64
pioneer       -0.63
Positive Logits
bottles       1.42
bottle        1.29
cans          1.10
cohol         1.02
liquor        1.00
soda          0.99
Cola          0.99
cig           0.98
whiskey       0.96
vodka         0.95
Activations Density 0.291%