INDEX
    Explanations
    Negative Logits
    +#+#
    -0.66
     juſ
    -0.61
    AddTagHelper
    -0.57
    Tembelea
    -0.54
    ">//
    -0.53
     pleaſure
    -0.52
     faſt
    -0.52
     myſelf
    -0.49
     ſche
    -0.49
     مشارکت‌کنندگان
    -0.49
    Positive Logits
    _
    0.52
     הנו
    0.45
    0.44
    chartInstance
    0.42
    PRF
    0.40
    August
    0.40
    Bob
    0.40
    ˛
    0.39
    ubby
    0.39
    danno
    0.39
    Activation Density 0.017%

    No Known Activations