INDEX
Explanations
references to IP addresses
citations and footnotes
Negative Logits
Token          Logit
verrez         -0.50
<bos>          -0.48
Hogar          -0.47
ymce           -0.47
industriel     -0.47
contactez      -0.46
femeninos      -0.46
religieux      -0.46
bancaire       -0.46
commerciaux    -0.46
Positive Logits
Token          Logit
]:             1.73
]]:            1.20
]:             1.14
"]:            1.02
']:            1.02
]:             0.98
:]:            0.92
>:             0.91
)]:            0.90
])):           0.86
Activations Density: 0.007%