INDEX
    Explanations
    Negative Logits
    " was"    -0.73
    "was"     -0.67
    " is"     -0.57
    "has"     -0.57
    " Was"    -0.56
    "Was"     -0.55
    " Has"    -0.55
    " has"    -0.54
    "Has"     -0.54
    " in"     -0.53
    Positive Logits
    " aren"            0.79
    " are"             0.75
    " operate"         0.74
    " donate"          0.73
    "Šaltiniai"        0.72
    " remind"          0.70
    "Personendaten"    0.69
    "IBOutlet"         0.68
    " violate"         0.68
    " belong"          0.65
    Act Density 0.247%

    No Known Activations