INDEX
Explanations
punctuation marks and special characters
Negative Logits
بيها     -0.82
AsUp     -0.71
Hamb     -0.70
ΜΑ       -0.69
Homme    -0.66
Merid    -0.64
Anz      -0.64
adona    -0.64
Oid      -0.64
(\%      -0.64
Positive Logits
])).     1.11
))).     1.05
})).     1.02
)){      1.01
)).      0.95
}),      0.94
")).     0.88
])),     0.88
)),      0.87
))),     0.87
Activations Density 0.731%