    Negative Logits
    Logit   Token
    -0.74   " USB"
    -0.73   "onesa"
    -0.71   " недвижи"
    -0.71   " 堀"
    -0.71   "激情"
    -0.69   "skaya"
    -0.68   " القرار"
    -0.68   " pudieran"
    -0.68
    -0.67   "FACT"
    Positive Logits
    Logit   Token
    1.16    " Fire"
    1.06    "Fire"
    0.95    " fire"
    0.83    " FIRE"
    0.82    "fire"
    0.79    "FIRE"
    0.76    " Target"
    0.73    "igration"
    0.73    "高速"
    0.73    " Firestone"
    Activation Density: 0.028%

    No Known Activations