INDEX
    Explanations
    Negative Logits
    Token              Logit
    [token missing]    -0.75
    " мәкал"           -0.71
    "Попис"            -0.71
    " незавершена"     -0.70
    " Numerade"        -0.69
    " Мексичка"        -0.63
    "oa̍t"              -0.62
    "Географиясе"      -0.62
    ":✨"               -0.61
    "المناصب"          -0.60
    Positive Logits
    Token              Logit
    "Text"             0.60
    " text"            0.54
    " Text"            0.49
    "text"             0.48
    "Word"             0.39
    " текст"           0.39
    [token missing]    0.35
    " building"        0.34
    " word"            0.34
    " chữ"             0.34
    Activation Density: 0.001%

    No Known Activations