INDEX
Explanations
actions or events involving physical movements or interactions between people in various settings
Negative Logits
Token          Logit
-)             -0.63
Reviewer       -0.59
itialized      -0.53
rely           -0.52
Timeline       -0.51
unker          -0.50
?)             -0.49
essor          -0.48
pires          -0.46
bernatorial    -0.45
Positive Logits
Token          Logit
.[             0.88
.              0.86
whereas        0.83
whilst         0.82
while          0.81
because        0.78
;              0.74
,[             0.73
.;             0.71
hoping         0.71
Activations Density: 1.081%