Explanations
mentions related to being a fan
mentions of fandom or fan-related content
Negative Logits
oslov           -0.76
xon             -0.72
bitters         -0.66
Labrador        -0.66
unfocusedRange  -0.65
innon           -0.65
terday          -0.64
Ness            -0.63
muddy           -0.63
Kob             -0.62
Positive Logits
Fan             1.23
Fan             1.12
igans           1.10
club            1.02
uci             0.88
fare            0.85
atics           0.83
atical          0.83
igan            0.77
boys            0.77
Activations Density: 0.008%