    Explanations

    references to user preferences or settings

    Negative Logits
    Token            Logit
    " ​​"             -0.48
    " gang"          -0.46
    "<bos>"          -0.46
    " out"           -0.46
    " army"          -0.45
    "ll"             -0.44
    " crimin"        -0.44
    " Sint"          -0.44
    " Johns"         -0.42
    " Helico"        -0.42
    Positive Logits
    Token            Logit
    " Preferences"   1.67
    " preferences"   1.66
    "Preferences"    1.62
    "preferences"    1.56
    " Preference"    1.35
    " preference"    1.34
    "Preference"     1.28
    "preference"     1.22
    " preferencias"  1.19
    " preferencia"   1.18
    Activation Density: 0.006%

    No Known Activations