Negative Logits
و  0.95
o  0.84
   0.75
h  0.73
↵  0.72
ный  0.71
al  0.70
ل  0.67
불구하고  0.67
वळ  0.66
Positive Logits
0  0.95
ຖືກ  0.81
ths  0.79
fêtes  0.78
addicts  0.77
truths  0.77
piccola  0.77
Jokes  0.77
দাতা  0.76
Props  0.75
Activations Density 0.005%