INDEX
NEGATIVE LOGITS
rež          0.47
乾燥         0.42
ódio         0.41
ombres       0.41
Sumber       0.40
schéma       0.40
burst        0.40
Peral        0.39
цвето        0.39
brecht       0.39
POSITIVE LOGITS
foo          0.61
foo          0.58
Foo          0.48
Foo          0.48
confusing    0.48
problematic  0.47
ambigu       0.45
unrelated    0.45
misused      0.44
mistaken     0.43
Activations Density 0.026%