INDEX
Explanations
phrases related to people coming in to take action
instances of coordination or connection between actions or subjects
Negative Logits
ictions     -0.73
chance      -0.62
ansk        -0.60
centr       -0.60
Nept        -0.58
ongo        -0.57
acci        -0.57
ung         -0.56
altern      -0.56
until       -0.56
Positive Logits
wre         0.95
greet       0.86
greeted     0.80
greets      0.79
bite        0.71
slay        0.69
demanded    0.68
apologized  0.67
promptly    0.67
ocide       0.66
Activations Density 0.244%
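
The density figure is most naturally read as the fraction of token positions on which the feature is active. The sketch below illustrates that reading only. It assumes "active" means an activation strictly above zero, which the page itself does not state; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 tokens; 2 of them are non-zero.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.244% would correspond to the feature firing on roughly 1 token in 410.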