INDEX
    Explanations

    adverbs that describe careful or precise actions

    Negative Logits
     hende           -0.61
     optimisation    -0.60
     Monuments       -0.57
     Galleria        -0.55
     weathering      -0.55
    есть             -0.55
     nodding         -0.54
     brightened      -0.54
     Comparable      -0.53
     Zä              -0.53
    Positive Logits
    correctly        0.86
     safely          0.83
    cibly            0.81
     directly        0.77
     successfully    0.75
     silently        0.74
    )._              0.74
    conditionally    0.72
    directly         0.72
     efficiently     0.72
    Act Density 0.496%

    No Known Activations