Explanations
instances of attention-grabbing or captivating actions and experiences
Negative Logits
pits
-0.51
Skocz
-0.49
!*\
-0.49
वरी
-0.49
CloseOperation
-0.47
surla
-0.47
HOUT
-0.47
{{/
-0.46
SIMBAD
-0.46
Портал
-0.45
Positive Logits
hold
1.04
glimpse
0.97
hold
0.92
glimpses
0.82
sight
0.82
attention
0.81
basah
0.76
phrases
0.72
onto
0.72
phrase
0.71
Activations Density 0.226%