INDEX
    Explanations

    instances of reported beliefs or thoughts expressed in various forms

    Negative Logits
    "awtextra"          -0.64
    " ElementRef"       -0.61
    "InjectAttribute"   -0.61
    " chi̍t"             -0.61
    " Chwiliwch"        -0.59
    " réessayer"        -0.58
    ""                  -0.56
    "ίδα"               -0.53
    " Ause"             -0.53
    "yssey"             -0.53
    Positive Logits
    " Wird"             0.81
    "Wird"              0.79
    " enkelte"          0.70
    " yapılan"          0.70
    " mennes"           0.69
    " tempio"           0.67
    " inimes"           0.66
    " brukes"           0.64
    " eaten"            0.63
    " verrà"            0.62
    Act Density 0.459%

    No Known Activations