INDEX
    Negative Logits
     impartiality    0.43
     morals          0.42
     apostolic       0.42
     opio            0.39
     sanctity        0.39
     Admit           0.38
     disabilities    0.38
    decimals         0.38
     Reven           0.37
     cleanliness     0.37
    Positive Logits
     regroup         0.45
     analysed        0.44
                     0.44
     رہنے            0.42
     ગા              0.41
                     0.41
    管理的           0.41
    转换             0.40
    ëren             0.40
                     0.39
    Act Density 0.254%

    No Known Activations