INDEX
    Explanations

    first in order or sequence

    Negative Logits
    " basic"        0.77
    " Beginning"    0.76
    "Beginning"     0.75
    "basic"         0.72
    "Basic"         0.72
    "roots"         0.71
    "ប្រ"            0.71
    "基本的な"      0.71
    " उत्साह"        0.70
    "の中"          0.68
    Positive Logits
    " 먼저"         2.85
    " first"        2.77
    "先に"          2.52
    "first"         2.48
    " zuerst"       2.47
    " primero"      2.47
    " FIRST"        2.32
    " først"        2.32
    "首先"          2.24
    "First"         2.19
    Act Density 0.243%

    No Known Activations