INDEX
    Explanations

    occurrences of specific unicode characters

    negations or phrases indicating disagreement

    Negative Logits
    Token         Logit
    " Gaul"       -0.65
    " jog"        -0.65
    " sacrific"   -0.62
    " blitz"      -0.57
    " Allies"     -0.56
    " capsule"    -0.56
    " Stats"      -0.55
    " Britons"    -0.55
    " looms"      -0.55
    " fumble"     -0.55

    Positive Logits
    Token         Logit
    "tre"         0.91
    "ï¸ı"         0.89
    "vable"       0.81
    "forth"       0.76
    "tu"          0.75
    "emb"         0.75
    "ulty"        0.75
    "vent"        0.74
    "yet"         0.74
    "minent"      0.74

    Act Density 0.135%

    No Known Activations