    Negative Logits
    Token           Logit
    "good"          -0.07
    " Basic"        -0.07
    " Assistance"   -0.06
    " Forest"       -0.06
    " Genetic"      -0.06
    " Fantastic"    -0.06
    "atory"         -0.06
    " Grand"        -0.06
    " Boulevard"    -0.06
    " HK"           -0.06
    Positive Logits
    Token           Logit
    " سور"          0.07
    "董事"          0.07
    " حرکت"         0.07
    " فق"           0.07
    "ーー"          0.07
    " boycott"      0.07
    "стров"         0.07
    "中学"          0.06
    " imprison"     0.06
    "_rom"          0.06
    Activation Density: 0.010%

    No Known Activations