INDEX
Explanations
verbs related to actions or processes being carried out on a large scale
actions related to shutdowns, closures, and arrests
Negative Logits
whatever        -0.67
something       -0.61
yip             -0.60
leader          -0.56
comes           -0.56
tone            -0.55
erity           -0.55
Darius          -0.54
prompt          -0.54
bral            -0.53
Positive Logits
apiece          0.92
consecut        0.90
abouts          0.87
individually    0.81
respectively    0.79
nationwide      0.79
simultaneously  0.78
throughout      0.75
worldwide       0.75
besides         0.74
Activations Density: 0.348%