INDEX
    Explanations

    the word "this" and its variations

    Negative Logits
    " unfortunate"    -0.44
    " important"      -0.40
    " infamous"       -0.37
    " part"           -0.37
    " pesky"          -0.35
    "listItem"        -0.35
    "ponym"           -0.35
    "kw"              -0.35
    " crime"          -0.34
    " unlucky"        -0.33
    Positive Logits
    " approach"       0.95
    " abordagem"      0.90
    " pendekatan"     0.86
    "approach"        0.84
    " approche"       0.82
    " APPROACH"       0.79
    " enfoque"        0.77
    "Approach"        0.76
    " způ"            0.72
    " način"          0.70
    Activation Density 0.096%

    No Known Activations