INDEX
Explanations
phrases emphasizing collective action or responsibility
references to collective or joint actions and contributions
Negative Logits
Elys         -0.82
ysis         -0.71
yle          -0.68
gery         -0.67
privilege    -0.66
Ale          -0.66
Kelvin       -0.65
Guardian     -0.65
Gore         -0.65
passage      -0.63
Positive Logits
umbered         0.91
assisted        0.81
narrated        0.80
identifiable    0.75
apologized      0.75
gebra           0.73
apologize       0.73
spaced          0.72
separated       0.72
speaking        0.71
Activations Density: 0.022%