INDEX
    Negative Logits
    Token                  Logit
    " Roskov"              -1.34
    " fans"                -1.26
    " followers"           -1.23
    " audiences"           -1.22
    " myſelf"              -1.17
    "HasAnnotation"        -1.16
    "fans"                 -1.15
    "fillType"             -1.15
    "migrationBuilder"     -1.14
    " Paglinawan"          -1.13
    Positive Logits
    Token                  Logit
    " of"                   0.71
    ","                     0.54
    "."                     0.53
    "..."                   0.47
    "!"                     0.47
    "↵↵"                    0.47
    ".."                    0.45
    "<eos>"                 0.44
    " are"                  0.42
                            0.42
    Act Density 0.128%
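
As a rough illustration of what a density figure like this means, the sketch below assumes that activation density is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition, for illustration only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 128 of which are active.
sample = [0.0] * 99_872 + [1.5] * 128
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.128% corresponds to the feature firing on roughly 1.28 of every 1,000 tokens.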

    No Known Activations