INDEX
    Explanations

    temporal references and places

    Negative Logits
    Token          Logit
    " Wich"        -0.15
    " Discrim"     -0.14
    "ration"       -0.14
    "zure"         -0.14
    "بس"           -0.13
    "filer"        -0.13
    " Crud"        -0.13
    "Ø¢Ùħ"         -0.13
    " pity"        -0.13
    " EXPRESS"     -0.13
    Positive Logits
    Token          Logit
    " âĢł"         0.27
    "âĢł"          0.22
    "ÂĨ"           0.21
    " gest"        0.21
    " died"        0.16
    "ãĥ¼ãĥĵ"       0.15
    "ocab"         0.15
    "oppins"       0.15
    "yi"           0.14
    " Hin"         0.14
    Act Density: 0.010%

    No Known Activations