INDEX
    Explanations
    Negative Logits
     inf         -0.65
     co          -0.64
     conf        -0.64
     inc         -0.63
     transfer    -0.62
     par         -0.62
     resolve     -0.61
     de          -0.61
     del         -0.61
     dep         -0.61
    Positive Logits
     madonna     1.63
     swarovski   1.53
     nephe       1.51
     versace     1.46
     murano      1.45
     leonardo    1.45
     affez       1.44
     bourgeo     1.42
     fluo        1.41
    pection      1.37
    Act Density 0.627%

    No Known Activations