INDEX
    Explanations

    specific entities or numbers

    Negative Logits
    Token           Value
    (not shown)     0.46
    " Kapoor"       0.44
    (not shown)     0.44
    " við"          0.43
    " seseorang"    0.42
    "sailboat"      0.41
    " kredit"       0.41
    " chcesz"       0.40
    " ukaz"         0.40
    " Machinist"    0.40
    Positive Logits
    Token           Value
    "ale"           0.51
    "p"             0.45
    "w"             0.43
    "their"         0.43
    " ус"           0.42
    "tained"        0.42
    "efa"           0.42
    "rem"           0.41
    "S"             0.41
    "ée"            0.41
    Activation Density: 0.012%

    No Known Activations