INDEX
    Explanations

    instances of gratitude or appreciation expressed in the text

    Negative Logits
    Token      Logit
    " spl"     -0.15
    "луг"      -0.14
    "utas"     -0.14
    " зави"    -0.14
    "ICLES"    -0.14
    "istes"    -0.13
    "ATIC"     -0.13
    "TION"     -0.13
    "isode"    -0.13
    "ottes"    -0.13
    Positive Logits
    Token      Logit
    "it"       0.19
    " there"   0.18
    "anky"     0.17
    "we"       0.16
    " hence"   0.15
    " this"    0.15
    " they"    0.15
    " it"      0.14
    " Carly"   0.14
    "there"    0.14
    Activation Density: 0.245%

    No Known Activations