INDEX
    Explanations
    NEGATIVE LOGITS
     Carter       -0.07
     Faction      -0.07
     Idol         -0.07
     Danger       -0.07
     tsl          -0.06
     narrator     -0.06
    afa           -0.06
     Festival     -0.06
     centre       -0.06
     dcc          -0.06
    POSITIVE LOGITS
     imp          0.06
     hypo         0.06
     méd          0.06
    creat         0.06
     highlights   0.06
    Custom        0.06
     sodom        0.06
    -generic      0.06
    conut         0.06
    BASEPATH      0.05
    Act Density 0.002%

    No Known Activations