    Explanations

    instances of repetition or similarity in phrases or descriptions

    words associated with correctness or appropriateness in various contexts

    NEGATIVE LOGITS
    ļéĨĴ              -0.81
    antam             -0.73
    ntil              -0.69
    ADRA              -0.68
    ibal              -0.68
    quished           -0.67
    ij                -0.64
    â̦â̦â̦â̦â̦â̦â̦â̦  -0.63
    Introdu           -0.63
    Minimum           -0.62
    POSITIVE LOGITS
     money        0.75
     smells       0.70
     messenger    0.70
     manner       0.69
     livelihood   0.68
     timetable    0.68
     histories    0.68
     manners      0.68
     colors       0.67
     geography    0.66
    Activation Density 0.717%

    No Known Activations