Explanations
adjectives and conjunctions
words and phrases related to contradictions and comparisons
Negative Logits
ftime      -0.71
amus       -0.70
izabeth    -0.69
herself    -0.64
exodus     -0.61
onement    -0.61
legate     -0.59
usk        -0.58
lon        -0.58
ffen       -0.58
Positive Logits
they       1.65
They       1.51
they       1.47
They       1.41
THEY       1.39
ones       1.14
their      1.11
These      1.10
These      1.08
Their      1.04
Activations Density 0.716%
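
The density figure above is usually read as the share of sampled token positions on which the feature fires. The sketch below shows that arithmetic under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its value is strictly
    greater than `threshold` (0.0 here); the real dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 7 of which have a nonzero activation.
sample = [0.0] * 993 + [0.4, 1.2, 0.1, 2.5, 0.9, 0.3, 1.8]
print(f"{activation_density(sample):.3%}")  # prints 0.700%
```

Under this reading, a density of 0.716% would mean the feature is active on roughly 7 of every 1,000 sampled tokens.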