INDEX
Explanations
actions of physical struggle or conflict
instances of violent or aggressive actions in the text
Negative Logits
Token       Logit
ゴン        -0.99
etheless    -0.81
ウス        -0.77
fine        -0.76
vre         -0.74
enz         -0.70
TPP         -0.70
eah         -0.69
Higher      -0.68
conom       -0.68
Positive Logits
Token       Logit
however     1.19
passers     0.93
she         0.93
though      0.86
they        0.84
Sergeant    0.83
he          0.82
Sgt         0.80
onlook      0.76
meanwhile   0.74
Activations Density 0.188%
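
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; the page does not state how the figure is computed, so the definition (activation strictly above a threshold of zero), the function name `activation_density`, and the sample data are all assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: a position counts as active when its activation is
    strictly greater than `threshold`. This is not taken from the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 188 of which activate the feature.
sample = [0.0] * 99_812 + [0.7] * 188
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.188% means the feature is active on roughly 19 of every 10,000 tokens, which makes it a sparse feature.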