INDEX
NEGATIVE LOGITS
IsContent          -0.66
CreateTagHelper    -0.57
الحره              -0.55
neos               -0.52
Portale            -0.52
ệnh                -0.50
Autoritní          -0.49
Paglinawan         -0.47
strick             -0.47
<!--[              -0.46
POSITIVE LOGITS
poffe               0.63
againſt             0.57
favoritas           0.56
himſelf             0.55
optique             0.53
seventh             0.53
aisladas            0.53
poffible            0.51
themſelves          0.51
########.           0.51
Activations Density 0.001%