INDEX
Explanations
phrases related to actions carried out by individuals or groups of people
past tense verbs indicating significant actions or events
Negative Logits
():       -0.70
veland    -0.63
mania     -0.61
iac       -0.61
yond      -0.59
angs      -0.58
omorph    -0.58
KY        -0.58
avin      -0.57
xual      -0.57
Positive Logits
themselves      1.14
respectively    1.02
unanimously     0.85
collectively    0.83
their           0.79
respective      0.74
alike           0.74
unison          0.71
jointly         0.71
together        0.71
Activations Density 0.617%