    Negative Logits
    Token             Logit
    " disaster"       -1.58
    "Disaster"        -1.55
    "disaster"        -1.44
    " Disaster"       -1.40
    " disasters"      -1.22
    " desastre"       -1.16
    " catastrophe"    -1.09
    " calamity"       -1.02
    " disastrous"     -0.90
    " katastro"       -0.90
    Positive Logits
    Token             Logit
    "y"               0.63
    " BoxFit"         0.60
    "d"               0.55
    " mode"           0.55
    "able"            0.53
    " незавершена"    0.52
    "ment"            0.51
    "rungsseite"      0.50
    "الدراسه"         0.50
    "ه"               0.49
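
    Token lists like the two above are commonly produced by projecting a feature's decoder direction through a model's unembedding matrix and reading off the most negative and most positive entries. The page does not state how its values were computed, so the following is only a sketch under that assumption. The function name, the toy vocabulary, and the random matrices are all hypothetical and do not reproduce the numbers shown here.

```python
import numpy as np


def top_and_bottom_logits(direction, unembed, vocab, k=10):
    """Return the k most negative and k most positive (token, logit) pairs.

    direction: (d_model,) feature decoder vector (hypothetical input)
    unembed:   (d_model, n_vocab) unembedding matrix (hypothetical input)
    vocab:     list of n_vocab token strings
    """
    # Per-token logit effect of adding the feature direction to the residual stream.
    logits = direction @ unembed
    order = np.argsort(logits)  # indices sorted from lowest to highest logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive


# Toy data only: a six-token vocabulary and random weights (assumptions, not real model values).
rng = np.random.default_rng(0)
vocab = [" disaster", "Disaster", " catastrophe", "y", " mode", "able"]
unembed = rng.normal(size=(8, len(vocab)))
direction = rng.normal(size=8)

negative, positive = top_and_bottom_logits(direction, unembed, vocab, k=3)

for title, rows in (("Negative Logits", negative), ("Positive Logits", positive)):
    print(title)
    for token, value in rows:
        # repr() keeps a token's leading space visible, e.g. ' disaster' vs 'disaster'.
        print(f"  {token!r:<16} {value:+.2f}")
```

    Printing tokens with repr() keeps variants that differ only by a leading space distinguishable, which matters for lists such as the negative logits above.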
    Activation Density: 0.051%

    No Known Activations