INDEX
    Explanations

    expressions of personal reflections or emotional experiences

    Negative Logits
     now
    -0.18
    now
    -0.17
    ags
    -0.16
    reds
    -0.15
    manuel
    -0.15
    ajar
    -0.15
    opot
    -0.14
     Krish
    -0.14
    onde
    -0.14
    rema
    -0.14
    Positive Logits
     future
    0.25
    future
    0.22
     futuro
    0.20
     будущ
    0.20
     Future
    0.19
    Future
    0.18
    _future
    0.17
     Kaynak
    0.16
     майбут
    0.16
    未来
    0.16
    Act Density 0.003%

    No Known Activations