INDEX
    Explanations

    items described with adjectives

    Negative Logits
    "тре"         0.54
    " milhões"    0.52
    " аны"        0.48
    "ру"          0.47
    "}$"          0.47
    "рен"         0.46
    "лен"         0.45
    (blank)       0.45
    " obligado"   0.45
    "тическая"    0.44
    Positive Logits
    "four"        0.53
    " eatery"     0.49
    "pointing"    0.48
    "goal"        0.47
    "inthe"       0.46
    (blank)       0.45
    "sp"          0.45
    "ARM"         0.45
    "fight"       0.44
    "Synchron"    0.44
    Activation Density: 0.001%

    No Known Activations