INDEX
Explanations
proper nouns
references to activism and activist-related activities
Negative Logits
Token          Logit
breath         -0.71
Ellen          -0.68
dreadful       -0.61
commencement   -0.60
Pell           -0.59
whence         -0.59
Hardy          -0.58
tasting        -0.58
SEAL           -0.58
Robin          -0.58
Positive Logits
Token          Logit
ists           1.53
ities          1.48
ision          1.43
ist            1.42
ité            1.40
ations         1.21
ation          1.13
ism            1.13
ated           1.10
istic          1.09
Activations Density: 0.040%