INDEX
    Negative Logits
    :          0.61
    diamonds   0.48
     andar     0.47
     reale     0.47
    gunaan     0.47
     eran      0.46
     andare    0.45
     و         0.45
     uso       0.45
    नम         0.44
    Positive Logits
    on            0.55
     January      0.46
     February     0.45
     নেতাকর্মীরা   0.44
    ón            0.44
    et            0.44
     Podczas      0.43
     Quando       0.43
     Lorsque      0.42
     Když         0.42
    Activation Density: 0.006%

    No Known Activations