INDEX
Explanations
references to alcoholic beverages
Negative Logits
Token      Logit
vitamina   -0.56
Responder  -0.54
Millard    -0.54
loss       -0.53
atasaray   -0.51
bolsa      -0.49
zuge       -0.48
كذا        -0.48
solare     -0.48
nido       -0.47
Positive Logits
Token      Logit
Beer       1.62
beer       1.60
Beer       1.53
BEER       1.49
beer       1.44
beers      1.35
wine       1.32
brewery    1.30
Beers      1.28
tequila    1.27
Activations Density: 2.417%