INDEX
    Explanations

    references to fandom or enthusiasm for various topics

    mentions of fans or enthusiasts

    Negative Logits
    Token               Logit
    "ateral"            -0.71
    "eneg"              -0.68
    "akespe"            -0.66
    "apeake"            -0.65
    " Proceedings"      -0.65
    " unfocusedRange"   -0.65
    "giene"             -0.61
    " Territ"           -0.61
    " Nurs"             -0.60
    " Lans"             -0.60
    Positive Logits
    Token               Logit
    "atical"            1.34
    "atics"             1.15
    "atically"          1.08
    "boys"              0.99
    "club"              0.95
    "fare"              0.94
    "atic"              0.92
    "boy"               0.88
    "igans"             0.87
    "hetical"           0.84
    Activation Density: 0.020%

    No Known Activations