INDEX
    Explanations

    the presence of the word "you."

    Negative Logits
     itſelf  -0.85
     Efq  -0.84
     reaſon  -0.74
     Reſ  -0.73
     Diſ  -0.68
    struktion  -0.68
     Chriftian  -0.67
     Jefus  -0.67
     ThemeData  -0.67
     (\<  -0.67
    Positive Logits
     را  0.80
    音を  0.75
    MENAFN  0.73
    larını  0.73
    線を  0.72
    த்தை  0.72
     को  0.70
    ığını  0.69
    いを  0.69
    devamını  0.69
    Activation Density 0.046%

    No Known Activations