INDEX
    Explanations

    references to individuals and their roles in a specified context

    Negative Logits
    alat        -0.17
     premise    -0.16
    isd         -0.15
    awner       -0.15
    awning      -0.15
    urve        -0.15
     Prem       -0.15
    vection     -0.14
    opes        -0.14
    aws         -0.14
    Positive Logits
    oter        0.16
    fusc        0.15
    sea         0.15
    PÅĻÃŃ       0.14
     Hubbard    0.14
    ILT         0.14
     Fritz      0.14
    bara        0.14
     disen      0.13
    Ø«ÛĮر       0.13
    Activation Density: 0.675%

    No Known Activations