INDEX
    Explanations
    NEGATIVE LOGITS
    " cient"          -0.49
    "adalajara"       -0.41
    "Javier"          -0.41
    "uffy"            -0.40
    " Javier"         -0.39
    " Machiavelli"    -0.39
    " laci"           -0.38
    " Acapulco"       -0.37
    "いけない"        -0.37
    " Handel"         -0.36
    POSITIVE LOGITS
    " Rose"     2.17
    "Rose"      2.13
    " rose"     1.91
    " ROSE"     1.91
    "rose"      1.78
    "ROSE"      1.70
    " Roses"    1.39
    " roses"    1.32
    "Roses"     1.10
    "roses"     1.09
    Activation Density: 0.003%

    No Known Activations