INDEX
    Explanations
    Negative Logits
    Token               Logit
    "Citiți"            -0.48
    " AssemblyTitle"    -0.48
    " MotionEvent"      -0.47
    "TouchableOpacity"  -0.43
    " Femen"            -0.42
    "bebasan"           -0.41
    "Bakgrunnsstoff"    -0.41
    "nables"            -0.41
    " CRIMINAL"         -0.41
    " Kriminal"         -0.40
    Positive Logits
    Token               Logit
    " soup"             0.84
    " Soup"             0.81
    "Soup"              0.76
    " soups"            0.75
    "soup"              0.61
    " broth"            0.59
    "dafx"              0.56
    "スープ"            0.54
    "味噌汁"            0.50
    " soupe"            0.50
    Activation Density 0.015%

    No Known Activations