INDEX
    Explanations

    the presence of curly braces or brackets in the text

    Negative Logits
    ので               -0.55
    -                  -0.53
     μέ                -0.53
    ETRIC              -0.52
     BorderRadius      -0.50
    yar                -0.48
     AT                -0.47
     soát              -0.47
     bro               -0.46
     tat               -0.45
    Positive Logits
    Disliked           0.95
                       0.92
    ""}                0.91
     cauſe             0.89
     reaſon            0.89
     purpoſe           0.89
     themſelves        0.88
     pleaſure          0.87
    ||}                0.86
     myſelf            0.86
    Act Density 0.695%

    No Known Activations