INDEX
Explanations
references to beer and related beverages
Negative Logits
ean     -0.17
ting    -0.17
al      -0.16
ted     -0.16
tt      -0.15
Dise    -0.15
eh      -0.15
RN      -0.15
aan     -0.15
eo      -0.15
Positive Logits
pong      0.29
bower     0.20
pong      0.20
gardens   0.19
adv       0.19
garden    0.19
goggles   0.18
taps      0.18
stagram   0.18
/gin      0.17
Activations Density 0.029%