INDEX
    Explanations

    references to camaraderie or companionship

    Negative Logits
    "Annette"         -0.76
    " Annette"        -0.73
    "pwr"             -0.69
    " tarvit"         -0.68
    " něko"           -0.66
    "stalline"        -0.65
    "ranton"          -0.65
    " Kanz"           -0.65
    "creas"           -0.64
    "ricanes"         -0.64

    Positive Logits
    " Fellows"         1.30
    "Fellow"           1.23
    " Fellow"          1.21
    " FELLOW"          1.20
    " Fellowship"      1.00
    " Fellowships"     0.99
    "fellow"           0.95
    " fellow"          0.90
    " fellows"         0.88
    " fellowship"      0.84
    Activation Density: 0.005%

    No Known Activations