INDEX
    Explanations
    NEGATIVE LOGITS
    וכים    -0.08
     pork    -0.08
     पद    -0.07
     gal    -0.07
    वाद    -0.07
    зор    -0.07
    (token not rendered)    -0.07
    Endpoints    -0.07
    (token not rendered)    -0.07
     endpoints    -0.07
    POSITIVE LOGITS
     beliefs    0.10
     provenant    0.09
    belief    0.08
     pensamientos    0.08
     convictions    0.08
     emitted    0.08
     bezüglich    0.08
     believing    0.08
     thoughts    0.07
    .sent    0.07
    Act Density 0.013%

    No Known Activations