INDEX
Explanations
phrases related to comparisons or relationships between different entities
references to relationships or comparisons between two entities
Negative Logits
onom    -0.68
rison   -0.64
NAME    -0.63
ERSON   -0.60
ç¥ŀ     -0.59
Ħ¢      -0.56
Lear    -0.54
oran    -0.54
id      -0.53
vana    -0.53
Positive Logits
between      3.30
between      3.02
Between      2.40
Between      1.97
BET          1.20
separating   1.20
among        1.12
amongst      1.11
around       0.96
across       0.94
Activations Density: 0.058%
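
One common reading of an activation density figure is the fraction of token positions on which the feature fires at all. The page does not say how its value was computed, so the sketch below is only an illustration under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the sample is synthetic and chosen only to give a value of the same magnitude as the one shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic, hypothetical sample: 100,000 token positions, 58 of them active.
sample = [0.0] * 99_942 + [1.0] * 58
density = activation_density(sample)

# Format as a percentage with three decimal places, as on the dashboard.
print(f"{density:.3%}")
```

A density this low would mean the feature is silent on almost every token, which fits a feature whose strongest positive logits cluster on "between" and a few related prepositions.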