    Negative Logits
    (token not recovered)    0.31
    " the"                   0.31
    " some"                  0.30
    " ("                     0.30
    " brain"                 0.29
    " a"                     0.29
    " pre"                   0.28
    " full"                  0.27
    " security"              0.27
    " namesake"              0.27
    Positive Logits
    "pleClass"        0.41
    " onların"        0.41
    " स्क्यर"          0.40
    "BUGFS"           0.40
    " paraissent"     0.40
    "ówczas"          0.39
    " felicidad"      0.39
    "<unused1680>"    0.39
    " individuos"     0.39
    " Sedangkan"      0.38
    Activation Density: 6.263%

    No Known Activations