Explanations
phrases related to activism or activist movements
terms related to activism
Negative Logits
Token       Logit
Oaks        -0.71
Ness        -0.68
Grind       -0.67
Hercules    -0.66
Carol       -0.65
Bake        -0.65
Neighbor    -0.65
Stafford    -0.65
lihood      -0.65
Tiger       -0.64
Positive Logits
Token       Logit
ists        1.12
ision       1.09
ist         1.05
ivation     1.00
uates       0.99
ivated      0.95
tion        0.95
activ       0.95
ité         0.94
ated        0.93
Activations Density: 0.012%