INDEX
    Explanations
    NEGATIVE LOGITS
    Token               Value
    " Gave"             0.65
    "р"                 0.63
    " jewel"            0.61
    " skyscrapers"      0.61
    "EMENTS"            0.60
    "רים"               0.60
    " Jewels"           0.60
    " Hearts"           0.59
    " Fee"              0.59
    " Semiconductors"   0.59

    POSITIVE LOGITS
    Token               Value
    "th"                0.68
    "9"                 0.66
    " medida"           0.60
    " disposição"       0.58
    "job"               0.57
    " बर्खास्त"           0.57
    "5"                 0.57
    " complicado"       0.56
    "8"                 0.55
    "خ"                 0.55

    Act Density 0.002%

    No Known Activations