INDEX
    Explanations

    history, geography, and extent

    Negative Logits
    "0"            0.50
    " pux"         0.45
    " bestimmten"  0.43
    " szok"        0.43
    "تين"          0.41
    "新的"         0.41
    "ä"            0.41
    " obat"        0.41
    "やっぱり"     0.41
                   0.40
    Positive Logits
    " throughout"  0.73
    " Throughout"  0.59
    "Throughout"   0.56
    " 분야"        0.53
    " tutta"       0.48
    " THRO"        0.48
    " вси"         0.48
    " всички"      0.48
    " усі"         0.46
    " всю"         0.45
    Activation Density: 0.004%

    No Known Activations