    Explanations

    phrases indicating causation or influence in emotional contexts

    Negative Logits
    (!__
    -0.46
    ſelf
    -0.34
     parallèle
    -0.34
    今度は
    -0.32
     colegios
    -0.32
    ğraf
    -0.32
     NDEBUG
    -0.32
     Exactos
    -0.32
    -0.31
    知らない
    -0.31
    Positive Logits
    Makes
    0.87
     Makes
    0.82
    makes
    0.80
     makes
    0.76
     MAKES
    0.63
    Feels
    0.56
    󠁢
    0.56
    feels
    0.55
     bikin
    0.55
     make
    0.54
    Act Density 0.156%

    No Known Activations