INDEX
    Explanations

    occurrences of the word "of"

    Negative Logits
    "olumn"     -0.15
    "lich"      -0.14
    "ileo"      -0.14
    "anova"     -0.14
    "illion"    -0.13
    "ac"        -0.13
    "roz"       -0.13
    " Guth"     -0.13
    "gorit"     -0.13
    "owers"     -0.13
    Positive Logits
    " few"      0.20
    " Europe"   0.19
    "LETE"      0.17
    " America"  0.16
    "Msp"       0.15
    "署"        0.15
    "few"       0.14
    "America"   0.14
    "Few"       0.14
    " Few"      0.14
    Act Density 0.046%

    No Known Activations