INDEX
    Explanations

    sentences that express thoughts, beliefs, or opinions

    Negative Logits
    Token              Logit
    " apparently"      -0.84
    " seemingly"       -0.83
    " Apparently"      -0.77
    " aparentemente"   -0.75
    "apparently"       -0.71
    " Seem"            -0.69
    "Apparently"       -0.68
    " schein"          -0.66
    " supposedly"      -0.64
                       -0.63
    Positive Logits
    Token              Logit
    " overall"         0.55
    " probably"        0.49
    " yüzden"          0.48
    "Probably"         0.45
    " partly"          0.45
    "もっと"           0.44
    " personally"      0.43
    " mostly"          0.42
    " mainly"          0.42
    " ultimately"      0.42
    Activation Density: 0.244%

    No Known Activations