    Explanations

    references to specific categories or characteristics

    Negative Logits
    Token             Logit
    "Économie"        -0.57
    " rağmen"         -0.56
    " normaux"        -0.56
    " llorar"         -0.55
    " respectivas"    -0.54
    " toekomst"       -0.52
    " supérieurs"     -0.51
    " perfeita"       -0.51
    " économies"      -0.49
    " devriez"        -0.49
    Positive Logits
    Token             Logit
    " kinds"          1.00
    " specific"       0.96
    " types"          0.93
    " amount"         0.93
    " particular"     0.91
    " kind"           0.88
    "kinds"           0.79
    "特定"            0.79
    " circumstances"  0.78
    "particular"      0.78
    Activation Density: 0.135%

    No Known Activations