INDEX
    Explanations

    Roman history

    Negative Logits
    " obsessive"          -0.10
    " humming"            -0.08
    "^↵↵"                 -0.08
    " previo"             -0.08
    "Palindrome"          -0.08
    " exakt"              -0.08
    " destino"            -0.07
    " Lightweight"        -0.07
    (no visible token)    -0.07
    " palindrome"         -0.07
    Positive Logits
    " המרכז"              0.08
    " הדר"                0.07
    " الدفاع"             0.07
    " Augusta"            0.07
    " זכ"                 0.07
    (no visible token)    0.07
    " כמו"                0.07
    " FED"                0.07
    " proclam"            0.07
    " מרכז"               0.07
    Activation Density: 0.001%

    No Known Activations