    Explanations

    references to a specific individual or personal identifier

    Negative Logits
    stood        -0.68
    stage        -0.67
    managed      -0.64
    lift         -0.64
    locked       -0.64
    ãĤ°          -0.63
     tolerate    -0.62
    ting         -0.61
    chest        -0.61
    lain         -0.60
    Positive Logits
    ences        1.12
    emi          0.98
    ppo          0.94
    oglu         0.92
    otti         0.90
    encia        0.89
    ardo         0.89
    zona         0.88
    pe           0.88
    ère          0.86
    Act Density 0.006%

    No Known Activations