INDEX
Explanations
references to flu and related symptoms
Negative Logits
token      logit
rrha       -0.18
ức         -0.17
lesi       -0.16
isí        -0.15
leans      -0.15
aleza      -0.15
PRI        -0.15
ped        -0.15
iflower    -0.15
letal      -0.15
Positive Logits
token      logit
shot       0.30
oxetine    0.30
oro        0.29
-shot      0.27
shots      0.26
shot       0.24
Shot       0.24
ency       0.24
_shot      0.23
vox        0.23
Activations Density 0.003%