INDEX
    Explanations

    words or phrases expressing admiration or negative sentiments towards individuals

    adoration, abhorrence, adoptive transfer

    Negative Logits
    "Startup"       -0.47
    " policia"      -0.44
    " promp"        -0.42
    "Result"        -0.42
    " punt"         -0.42
    " rheumat"      -0.42
    " Marke"        -0.42
    " definit"      -0.41
    " Chemin"       -0.41
    "FFIC"          -0.41
    Positive Logits
    " adore"        1.53
    " adored"       1.45
    " adoration"    1.13
    " adoro"        1.05
    "adore"         1.04
    " ador"         0.86
    " Adorable"     0.76
    " älskar"       0.75
    " adorable"     0.73
    " paixão"       0.70
    Activation Density: 0.002%

    No Known Activations