INDEX
    Explanations

    words before sentence terminators

    Negative Logits
        Token               Value
        かもしれませんが    0.60
        *,                  0.55
        ,[                  0.55
        *;                  0.53
        ),[                 0.52
        !),                 0.51
        :(                  0.50
        \%),                0.48
        +,                  0.47
         totiž              0.46
    Positive Logits
        Token               Value
        ."                  0.67
        (token missing)     0.65
        (token missing)     0.63
        .                   0.60
        (token missing)     0.55
        .\                  0.54
        ."""                0.54
        .”                  0.54
        ".                  0.52
         Надо               0.52
    Activation Density: 0.046%

    No Known Activations