INDEX
    Explanations

    phrases indicating a debate or controversy

    Negative Logits
     متعلقه  -0.64
    ̍t  -0.58
    دانشنامهٔ  -0.57
    reactstrap  -0.56
    -0.54
    <bos>  -0.54
    Personensuche  -0.53
    sizeCache  -0.52
    ɚ  -0.51
    WriteTagHelper  -0.51

    Positive Logits
     little  2.05
     nothing  1.74
    little  1.58
     no  1.50
     zero  1.49
     minimal  1.45
     few  1.40
     none  1.39
     ZERO  1.29
     weinig  1.29

    Activation Density: 0.873%

    No Known Activations