INDEX
    NEGATIVE LOGITS
    Token             Logit
    " repent"         -0.08
    "irge"            -0.08
    " ular"           -0.08
    " devote"         -0.08
    " perempuan"      -0.08
    "CCI"             -0.08
    " минист"         -0.08
    "оратив"          -0.07
    " eingerichtet"   -0.07
    " ома"            -0.07
    POSITIVE LOGITS
    Token             Logit
    " hypotheses"     0.15
    " hypothesis"     0.13
    " hypoth"         0.11
    " לגבי"           0.10
    " regarding"      0.10
    " beliefs"        0.09
    " premises"       0.09
    " conject"        0.09
    " assumptions"    0.09
    " بشأن"           0.09
    Activation Density: 0.015%

    No Known Activations