Explanations
the word "fans"
references to fans and their importance or involvement
Negative Logits
nom         -0.70
srfAttach   -0.66
lished      -0.65
olon        -0.64
ateral      -0.62
ãģĨ         -0.62
Prosecut    -0.62
Surgery     -0.62
EDIT        -0.61
vet         -0.61
Positive Logits
haw         0.93
atics       0.93
atical      0.88
ubs         0.87
boo         0.85
cheering    0.84
Atmosp      0.84
verson      0.83
atti        0.82
boys        0.79
Activations Density: 0.030%