Explanations
mentions of the word 'alcohol' at varying intensities
references to alcohol and its implications
Negative Logits
Token       Logit
arity       -0.75
Dear        -0.70
Telecom     -0.70
Dear        -0.69
Kaw         -0.68
Postal      -0.67
NCT         -0.66
VIEW        -0.66
Pon         -0.66
Goff        -0.64
Positive Logits
Token       Logit
alcohol     1.18
cohol       1.14
beverages   1.06
drinkers    1.04
alcohol     1.04
ocaust      1.03
beverage    1.02
dehyd       0.99
liquor      0.98
poisoning   0.96
Activations Density 0.010%