INDEX
    Explanations

    occurrences of the word "on."

    Negative Logits
     InputDecoration    -0.82
    Rüyada              -0.81
     virginity          -0.68
    ThroughAttribute    -0.64
    RectangleBorder     -0.63
     UIFont             -0.62
     pleaſure           -0.61
     tachy              -0.61
     myſelf             -0.61
     isoto              -0.60
    Positive Logits
     based              0.90
    Based               0.79
     Based              0.78
     berdasarkan        0.77
    based               0.72
     basada             0.72
    @",                 0.68
    基于                0.68
     BASED              0.66
     основе             0.64
    Act Density 0.179%

    No Known Activations