INDEX
    Negative Logits
    Token           Logit
    " Thirteen"     -0.90
    " thirteen"     -0.85
    " Twelve"       -0.82
    " Eleven"       -0.82
    " twelve"       -0.82
    "Thirteen"      -0.79
    " Julie"        -0.79
    "Twelve"        -0.74
    " eleven"       -0.73
    "Julie"         -0.73
    Positive Logits
    Token           Logit
    "enda"           0.72
    " canning"       0.71
    "eton"           0.68
    "ENDA"           0.68
    "тті"            0.66
                     0.66
    "Tung"           0.65
    " CHA"           0.65
    " стомато"       0.64
    " Zach"          0.63
    Activation Density: 0.310%

    No Known Activations