INDEX
    Explanations
    Negative Logits
    " Taurus"        -0.73
    "리스"            -0.72
    " sufferings"    -0.71
    " dianteiro"     -0.68
    "口水"            -0.66
                     -0.66
    "ResourceId"     -0.66
    "сія"            -0.66
    " sufferer"      -0.66
    " vergessen"     -0.65
    Positive Logits
    " option"        1.19
    " options"       1.11
    "OPTION"         1.08
    " Option"        1.04
    "option"         1.02
    " Options"       1.00
    "OPTIONS"        0.98
    "getOptions"     0.92
    " opciones"      0.91
    "options"        0.89
    Activation Density: 0.009%

    No Known Activations