    Explanations

    quantitative data and numeric values in a structured context, likely related to statistical or mathematical content

    Negative Logits
    ロウィン    -0.80
    ſſung    -0.75
     zwiſchen    -0.73
     Administrativna    -0.71
     パンチラ    -0.71
     imagui    -0.70
    <unused80>    -0.69
    <unused68>    -0.69
    [@BOS@]    -0.69
    <unused3>    -0.69
    Positive Logits
    thée    0.27
    rzost    0.26
     valable    0.25
    é    0.25
    textTheme    0.25
     dégustation    0.25
     mesmo    0.25
    ang    0.25
    <eos>    0.24
     zelf    0.24
    Act Density 0.096%

    No Known Activations