INDEX
Explanations
the words "hip hop" or job titles containing "vice president". The two are distinct enough that the model may be using "hip" alone to find "hip hop" and "vice" alone to find job titles
Negative Logits
Token             Logit
ReusableCell      -0.93
resourceCulture   -0.89
CreateTagHelper   -0.79
RetentionPolicy   -0.74
تقاوى             -0.74
kaynağından       -0.71
GrantedAuthority  -0.70
HasFactory        -0.69
JpaRepository     -0.68
TableHead         -0.68
Positive Logits
Token             Logit
<bos>             1.84
'                 0.50
@                 0.50
x                 0.47
u                 0.47
t                 0.46
H                 0.45
Enter             0.45
di                0.45
is                0.45
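Lists like the two above are usually read as a ranking of vocabulary tokens by how strongly the feature lowers or raises their logit. The sketch below shows one way such a ranking could be produced. It assumes a hypothetical mapping `logit_effects` from token string to score; the sample values are copied from the lists above, and how the dashboard itself computes them is not stated here.

```python
import heapq


def top_logits(logit_effects, k=3):
    """Return the k most negative and k most positive (token, score) pairs.

    `logit_effects` is a hypothetical dict of token -> score; it is an
    assumption for illustration, not the dashboard's own data structure.
    """
    items = logit_effects.items()
    negative = heapq.nsmallest(k, items, key=lambda kv: kv[1])
    positive = heapq.nlargest(k, items, key=lambda kv: kv[1])
    return negative, positive


# A few of the scores listed above.
logit_effects = {
    "ReusableCell": -0.93,
    "resourceCulture": -0.89,
    "CreateTagHelper": -0.79,
    "<bos>": 1.84,
    "'": 0.50,
    "x": 0.47,
}

negative, positive = top_logits(logit_effects, k=2)
for token, score in negative + positive:
    print(f"{token!r:20} {score:+.2f}")
```

Using `heapq` avoids sorting the whole vocabulary when only the extremes are needed.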
Activations Density: 0.187%
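Activation density is commonly interpreted as the fraction of sampled tokens on which the feature fires. A minimal sketch under that assumption follows; the function name, the zero threshold, and the sample size are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumes density means "share of tokens where the feature is active";
    the threshold of 0.0 is an illustrative choice.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 187 of 100,000 tokens.
sample = [1.0] * 187 + [0.0] * (100_000 - 187)
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is sparse: it is inactive on the vast majority of tokens.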