INDEX
    Explanations
    Negative Logits
    Token             Logit
    "selves"          -0.74
    " Rout"           -0.69
    "purpose"         -0.68
    " expedition"     -0.66
    "robe"            -0.64
    " Project"        -0.61
    " homeland"       -0.61
    " scholarships"   -0.60
    " classmates"     -0.60
    " classmate"      -0.60
    Positive Logits
    Token             Logit
    "ariat"           0.67
    [token missing]   0.64
    " spawned"        0.64
    "orum"            0.64
    "alla"            0.63
    [token missing]   0.63
    "imen"            0.62
    " Kelvin"         0.62
    "idity"           0.62
    "ylan"            0.62
    Activation Density: 0.072%

    No Known Activations