INDEX
Explanations
control over / dive into / focus on
Negative Logits
Token        Logit
한테          0.84
padă         0.79
كلهم         0.75
<unused90>   0.74
𝓾            0.73
一脸          0.73
accay        0.73
льному       0.73
jaoks        0.72
novedad      0.72
Positive Logits
Token        Logit
regarding    1.20
concerning   0.97
towards      0.96
toward       0.92
regarding    0.85
Regarding    0.82
surrounding  0.79
both         0.79
Regarding    0.77
not          0.75
Activations Density 0.368%