Explanations
No Explanations Found
Negative Logits
ul    0.65
ب    0.58
िक    0.55
n    0.54
s    0.53
يا    0.51
其他    0.50
های    0.49
ą    0.49
م    0.48
Positive Logits
۰    0.81
↵    0.64
।’    0.59
can    0.58
।    0.58
۵    0.57
I    0.55
۹    0.54
,’    0.53
।”    0.52
Activations Density 0.000%
This feature has no known activations.