INDEX
    Explanations

    phrases or questions that inquire about types or categories of things

    Negative Logits
     illustrazione   -0.63
     aérea           -0.63
     évêque          -0.63
     démission       -0.62
     frequenza       -0.58
     preghiera       -0.57
     nervioso        -0.57
     biologie        -0.57
     lettura         -0.57
     menikah         -0.57
    Positive Logits
    "):
    
    0.95
    ".
    
    0.92
    mergeFrom
    0.83
    ...',
    0.83
    )";
    
    0.82
    ...",
    0.81
    0.81
    ":
    
    0.80
    ;'>
    0.79
    \"",
    0.78
    Act Density 0.103%

    No Known Activations