INDEX
    Explanations

    numerical values and their associated formats or statistics in the text

    Negative Logits
     cauſe        -0.83
     purpoſe      -0.78
     ſtate        -0.77
     pleaſure     -0.72
     caufe        -0.72
    theless       -0.72
     समीक्षक       -0.68
     houſe        -0.68
     myſelf       -0.68
     خارجية       -0.67
    Positive Logits
     making       1.01
     being        0.98
     giving       0.94
     taking       0.87
     using        0.86
     having       0.81
     getting      0.80
     utilizing    0.78
     bringing     0.78
     running      0.75
    Activation Density: 0.602%

    No Known Activations