INDEX
Explanations
references to numbers or statistics
references to actions and events related to conflict or disruptions
Negative Logits
Token          Logit
etc            -0.61
respectively   -0.58
'.             -0.58
thereto        -0.57
*.             -0.53
accordingly    -0.53
.).            -0.53
).[            -0.52
".             -0.50
+.             -0.50
Positive Logits
Token          Logit
vez            0.50
endon          0.45
romeda         0.44
namese         0.43
rad            0.40
buquerque      0.39
different      0.38
illance        0.38
oug            0.38
ridor          0.38
Activations Density: 2.666%