INDEX
    Negative Logits
    Token         Logit
    "yrinth"      -0.86
    "romy"        -0.76
    "IU"          -0.74
    "EMS"         -0.69
    "ipolar"      -0.67
    "irth"        -0.67
    "nuts"        -0.66
    "ilib"        -0.63
    " Safari"     -0.62
    "icult"       -0.62
    Positive Logits
    Token         Logit
    "plates"       1.22
    "plate"        1.09
    "paces"        0.99
    " names"       0.91
    " aliases"     0.90
    " NAME"        0.89
    " name"        0.89
    "akes"         0.81
    " tag"         0.80
    "names"        0.80
    Activation Density: 0.035%

    No Known Activations