INDEX
NEGATIVE LOGITS
Every       0.47
ровка       0.46
отрима      0.45
நான்கு      0.45
mudança     0.44
наші        0.44
можуть      0.44
refusal     0.43
monstros    0.43
0.43
POSITIVE LOGITS
bese        0.38
guna        0.37
parallel    0.37
attached    0.36
yl          0.35
poll        0.35
Schlü       0.35
speaker     0.34
part        0.34
deceased    0.34
Activations Density 0.019%