INDEX
    Explanations

    occurrences of the word "new"

    Negative Logits
    " kaarangay"          -0.56
    "Personendaten"       -0.49
    "########."           -0.47
    "queryInterface"      -0.47
    "postIndex"           -0.47
    "ConstraintMaker"     -0.46
    " Мексичка"           -0.45
    " препратки"          -0.44
    "Portály"             -0.44
    "Vidite"              -0.43
    Positive Logits
    " new"                0.58
    " Mark"               0.54
    " Newberry"           0.53
    " Taw"                0.51
    " Great"              0.51
    " Green"              0.51
    " Main"               0.50
    " New"                0.50
    "↵↵"                  0.49
    "Mark"                0.49
    Act Density 0.002%

    No Known Activations