    Explanations

    words indicating size or importance

    Negative Logits
     referenties
    -0.74
    InitVars
    -0.73
     itſelf
    -0.68
     autorytatywna
    -0.68
    (()
    -0.65
    DataAnnotations
    -0.65
    सनीय
    -0.62
    Wonderful
    -0.61
     whofe
    -0.60
    BASELINE
    -0.58
    Positive Logits
     più
    1.20
     piu
    1.01
     más
    1.00
     Più
    0.98
     máis
    0.94
    più
    0.93
     més
    0.91
     lebih
    0.91
     MÁS
    0.90
     πιο
    0.86
    Act Density 0.070%

    No Known Activations