    NEGATIVE LOGITS
    insurer            0.94
    ivores             0.93
    <unused1296>       0.93
    oxide              0.91
    <unused1691>       0.90
    <unused1612>       0.90
    <unused1391>       0.88
    <unused1717>       0.88
    (no token text)    0.88
    <unused852>        0.87
    POSITIVE LOGITS
     COVID             0.89
     Scottish          0.88
     Japanese          0.87
     modern            0.86
     Christian         0.85
     българ            0.84
     the               0.84
     basketball        0.83
    (whitespace)       0.81
     Canadian          0.80
    Act Density 4.740%

    No Known Activations