INDEX
    Explanations

    references to features in various contexts

    Negative Logits
    Token               Logit
    " pregunto"         -0.88
    "Alp"               -0.87
    " Pamph"            -0.84
    " PVA"              -0.80
    " photolibrary"     -0.80
    " geboren"          -0.80
    " pessoais"         -0.80
    " meisten"          -0.78
    " clearColor"       -0.77
    "thschild"          -0.77
    Positive Logits
    Token               Logit
    " features"         2.95
    " feature"          2.74
    " Features"         2.64
    "features"          2.51
    "Features"          2.47
    " Feature"          2.46
    " FEATURE"          2.38
    "Feature"           2.36
    " FEATURES"         2.32
    "feature"           2.31
    Activation Density: 0.061%

    No Known Activations