    Negative Logits
    Token              Logit
    " dish"            -0.89
    "dish"             -0.69
    " dishes"          -0.68
    "s"                -0.61
    " Dish"            -0.57
    " Dishes"          -0.57
    " League"          -0.55
    "t"                -0.53
    " gaming"          -0.51
    "AxisAlignment"    -0.50
    Positive Logits
    Token              Logit
    " autorytatywna"   0.96
    " Jefus"           0.85
    " Efq"             0.85
    " Monfieur"        0.81
    "сылкі"            0.76
    "DoubleQuotes"     0.76
    "modelBuilder"     0.72
    " الحره"           0.71
    " Majefty"         0.71
    " Chrift"          0.68
    Activation Density: 1.279%

    No Known Activations