    Explanations

    terms related to governmental or organizational structures, particularly those involving resistance or revolution

    Negative Logits
    Token            Logit
    " houſe"         -0.68
    " ſmall"         -0.67
    " Dinamarca"     -0.64
    " محفوظة"        -0.64
    " faſt"          -0.63
    " themſelves"    -0.63
    " myſelf"        -0.62
    " ſta"           -0.60
    " deſt"          -0.60
    " sánchez"       -0.59
    Positive Logits
    Token                Logit
    "TagHelper"          0.68
    "UnusedPrivate"      0.67
    "valt"               0.62
    " neo"               0.61
    " Италијани"         0.61
    "eriks"              0.60
    "ChildScrollView"    0.57
    "neurial"            0.57
    "rungsseite"         0.57
    " ga"                0.56
    Activation Density: 0.048%

    No Known Activations