INDEX
Explanations
adjectives followed by a noun
keywords associated with complexity and ambiguity in situations
Negative Logits
itta        -0.62
wow         -0.61
urai        -0.61
hatt        -0.60
anwhile     -0.59
illi        -0.56
isin        -0.56
DERR        -0.55
watching    -0.55
oward       -0.55
Positive Logits
ones        2.33
hers        1.46
ours        1.41
theirs      1.40
one         1.37
yours       1.34
Ones        1.14
mine        1.03
one         0.93
ONE         0.86
Activations Density 0.596%
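
As a rough illustration of what a density figure like the one above could mean, here is a minimal sketch. It assumes that density is the percentage of token positions where the feature's activation is above zero; the page does not state the definition, so that is an assumption. The function name `activation_density` and the toy data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires at all. The source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: the feature fires on 3 of 500 token positions.
toy = [0.0] * 497 + [0.8, 1.2, 0.3]
print(f"{activation_density(toy):.3f}%")
```

Under this reading, a density of 0.596% would mean the feature fires on roughly 6 of every 1,000 token positions in the sampled text.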