    Explanations
    No Explanations Found
    Negative Logits
    antro       -0.15
    onest       -0.15
    νο          -0.15
    лаб         -0.14
    港          -0.14
    psilon      -0.14
    apers       -0.13
    aurant      -0.13
    habi        -0.13
    VL          -0.13
    Positive Logits
     Native        0.47
    Native         0.42
     native        0.38
     Indigenous    0.35
     indigenous    0.33
     natives       0.30
    native         0.30
    .Native        0.30
    /native        0.30
    .native        0.29
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.