INDEX
    Explanations

    negations or the word "not."

    Negative Logits
     insuffisamment    -0.56
     cannot            -0.51
    Organisateur       -0.49
     tidak             -0.45
     concluded         -0.44
     even              -0.43
     ikke              -0.43
     không             -0.43
     occasionally      -0.42
     non               -0.41
    Positive Logits
    buy                0.57
    Screen             0.56
    not                0.52
    Moon               0.51
    :✨                 0.50
    crops              0.50
    moon               0.49
    Theme              0.49
    proposition        0.49
     noDo              0.49
    Act Density 0.475%

    No Known Activations