INDEX
    Explanations

    proper nouns and specific entities

    Negative Logits
    Token            Value
    RestorePolicy    0.37
    краё             0.35
    ясплат           0.35
    Điều             0.34
                     0.33
    }"               0.33
     realtime        0.33
    лянчук           0.32
     эмоциона        0.32
     normativa       0.32
    Positive Logits
    Token            Value
     Antarctica      0.46
     tobacco         0.45
     the             0.43
     Dracula         0.42
     Cleopatra       0.41
     Napoleon        0.41
     Beethoven       0.41
     Persia          0.41
     chocolate       0.41
     opium           0.41
    Activation Density: 0.179%

    No Known Activations