    Explanations

    names of people or characters

    Negative Logits
    ".            -0.65
     inh          -0.65
     appre        -0.63
    !".           -0.63
     elig         -0.62
     Learns       -0.62
    ").           -0.62
     indo         -0.61
     whereas      -0.60
     ".           -0.57
    Positive Logits
     alike        1.03
    axter         0.88
    oliath        0.80
     cohorts      0.72
     colleagues   0.67
    ossal         0.66
    mates         0.61
     others       0.61
     are          0.61
     teammate     0.60
    Act Density 0.380%

    No Known Activations