    Negative Logits
    Token      Value
    6          0.45
    ibs        0.39
    4          0.39
    5          0.38
    bit        0.37
    product    0.37
    环境下     0.35
    3          0.35
    ib         0.35
    (token not rendered)  0.35
    Positive Logits
    Token      Value
     esterno   0.47
    ו          0.47
    ্ৰ          0.44
     yeni      0.43
    (token not rendered)  0.43
    ক্লা        0.42
     carbono   0.42
     febbraio  0.41
    🏚         0.41
     enero     0.41
    Activation Density: 0.043%

    No Known Activations