Explanations
Terms associated with targeting and goals in various contexts.
Negative Logits
Token     Logit
dear      -0.16
IGH       -0.16
quer      -0.15
ibur      -0.15
bons      -0.15
ances     -0.14
enia      -0.14
amina     -0.14
raj       -0.14
erty      -0.14
Positive Logits
Token     Logit
ted       0.31
ting      0.24
="_       0.23
(Target   0.20
/target   0.18
=target   0.18
Touches   0.18
edReader  0.17
entin     0.17
tır       0.17
Activations Density: 0.031%