Explanations
Instances of the word "within" and its variants, suggesting that the feature detects discussions of containment or context.
Negative Logits
Token            Logit
er               -0.77
O                -0.69
cy               -0.69
lar              -0.69
io               -0.69
es               -0.67
'                -0.65
گ                -0.64
pep              -0.64
ar               -0.63
Positive Logits
Token            Logit
within           1.85
Within           1.82
Within           1.78
within           1.72
WITHIN           1.72
InputDecoration  1.25
binnen           1.23
entro            1.12
innerhalb        1.12
dentro           1.09
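As a rough illustration of where token lists like these usually come from: dashboards for sparse-autoencoder features commonly project the feature's output direction through the model's unembedding matrix and report the tokens with the highest and lowest scores. The page itself does not say how its numbers were computed, so this is an assumption. The vocabulary, the vectors, and the function name `top_logit_tokens` below are toy, hypothetical values.

```python
def top_logit_tokens(direction, unembed, vocab, k=2):
    """Rank tokens by the dot product of a feature direction with each unembedding row.

    Assumption: one unembedding row per vocabulary token, same length as `direction`.
    Returns (top-k most positive, top-k most negative) as (token, score) pairs.
    """
    scores = [sum(d * w for d, w in zip(direction, row)) for row in unembed]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])  # ascending by score
    negative = ranked[:k]         # lowest scores first
    positive = ranked[::-1][:k]   # highest scores first
    return positive, negative


# Toy 2-dimensional example (hypothetical values, not taken from the dashboard).
vocab = ["within", "inside", "er", "cy"]
unembed = [[1.0, 0.5], [0.6, 0.2], [-0.4, -0.1], [-0.3, 0.0]]
direction = [1.5, 0.5]

positive, negative = top_logit_tokens(direction, unembed, vocab)
print("positive:", positive)
print("negative:", negative)
```

In this toy setup the tokens aligned with the direction land in the positive list and the unrelated fragments land in the negative list, which mirrors the shape of the two lists above.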
Activations Density: 0.080%
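To make the density figure concrete, here is a minimal sketch that assumes "activation density" means the fraction of sampled token positions on which the feature's activation is non-zero. That is the usual reading, but the page does not define it. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 10,000 token positions, 8 of which activate the feature.
acts = [0.0] * 9992 + [0.3, 1.2, 0.7, 2.1, 0.4, 0.9, 1.5, 0.6]
density = activation_density(acts)
print(f"{density:.3%}")  # 0.080%
```

Under this reading, a density of 0.080% means the feature fires on roughly 8 out of every 10,000 tokens, which is consistent with a feature tied to a single word and its variants.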