INDEX
    Explanations

    words containing non-English characters, specifically umlauts and other accents

    specific names and terms associated with individuals and locations

    Negative Logits
    Token          Logit
    " Debor"       -0.83
    " Fired"       -0.73
    " Reef"        -0.73
    " Rhino"       -0.69
    " scorp"       -0.68
    "ORK"          -0.67
    " Turtles"     -0.66
    "BILITIES"     -0.65
    "yrinth"       -0.64
    "ombat"        -0.64

    Positive Logits
    Token          Logit
    "ön"           1.44
    "vironment"    1.00
    "ning"         0.85
    "ü"            0.83
    "bach"         0.81
    "ä"            0.79
    "icht"         0.79
    "kamp"         0.77
    "agar"         0.76
    "ollen"        0.76

    Activation Density: 0.008%

    No Known Activations