INDEX
    Explanations
    Negative Logits
    Token          Logit
    " peroxide"    0.43
    "Cars"         0.42
    " Per"         0.41
    "нову"         0.41
    " car"         0.41
    " Hispan"      0.40
    " glimpses"    0.39
    " off"         0.38
    "scripts"      0.38
    " cast"        0.38
    Positive Logits
    Token          Logit
    " Komitet"     0.44
    "青少年"       0.38
    "herr"         0.38
    (blank)        0.37
    (blank)        0.37
    " fabricating" 0.36
    (blank)        0.36
    "oscow"        0.36
    " člán"        0.36
    "يم"           0.35
    Act Density 0.000%

    No Known Activations