    Explanations

    mentions of architectural features and spatial arrangements

    Negative Logits
    Token            Logit
    "-has"           -0.15
    "yster"          -0.15
    "ibu"            -0.15
    " olmadıģını"    -0.14
    "’Ñıз"           -0.14
    " ÙĨدارد"        -0.14
    " ÙĨد"           -0.14
    "iÄħ"            -0.14
    " hadn"          -0.14
    "ä¸įä¼ļ"         -0.13
    Positive Logits
    Token            Logit
    " are"           0.53
    "çļĦæĺ¯"         0.35
    " were"          0.34
    " lies"          0.32
    "_are"           0.32
    " is"            0.31
    " there"         0.30
    " lie"           0.30
    " Are"           0.29
    " estão"         0.29
    Activation Density: 0.255%

    No Known Activations