    Explanations
    No Explanations Found
    Negative Logits
    "."          -0.40
                 -0.36
    "<eos>"      -0.36
    " ("         -0.36
    ","          -0.36
                 -0.35
    "0"          -0.34
    " ["         -0.33
    "1"          -0.33
    "are"        -0.32
    Positive Logits
    "+#+"              9.19
    "#+#"              2.34
    ":+:"              2.20
    "httphttps"        1.87
    "+#+#"             1.72
    " autorytatywna"   1.66
    "########."        1.65
    "Personendaten"    1.61
    "ValueStyle"       1.46
    " betweenstory"    1.45
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.