INDEX
Explanations
the concept of focus in various contexts
Negative Logits
hans      -0.17
utan      -0.16
IMS       -0.16
Holmes    -0.16
oner      -0.15
ectors    -0.15
vin       -0.15
antine    -0.14
anson     -0.14
tl        -0.14
Positive Logits
cus       0.21
ussed     0.20
usses     0.17
λια       0.17
uss       0.16
selling   0.15
cing      0.15
ingular   0.15
ilon      0.14
Mitch     0.14
Activations Density 0.008%
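
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of sampled tokens on which the feature activates above a threshold. This is an assumption about the definition, not something the page states; the function name `activation_density`, the zero threshold, and the toy sample are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature fires.

    Assumed definition: a token counts as "active" when its activation
    is strictly greater than `threshold`. The dashboard may use a
    different rule.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (not from the dashboard): 100,000 tokens, 8 of which activate.
sample = [0.0] * 99_992 + [1.3] * 8
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.008%
```

Under this reading, a density of 0.008% means the feature fires on roughly 8 of every 100,000 tokens, which is a sparse feature.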