INDEX
    Explanations
    Negative Logits
    Token            Logit
    "🧢"             -0.96
    " groen"         -0.95
    " gracilis"      -0.94
    " schild"        -0.94
    " witte"         -0.93
    " arşivlendi"    -0.93
    "õe"             -0.92
    " inz"           -0.91
    " klasse"        -0.91
    " stadion"       -0.90
    Positive Logits
    Token               Logit
    " things"           3.09
    " unexpected"       2.39
    " accidents"        2.20
    " unforeseen"       2.03
    "Things"            2.03
    " Things"           1.94
    " circumstances"    1.91
    " stuff"            1.88
    " sometimes"        1.86
    "things"            1.86
    Act Density 0.046%

    No Known Activations