NEGATIVE LOGITS
NAM           0.42
execute       0.41
unexpected    0.37
swearing      0.37
execution     0.37
inator        0.37
kne           0.37
entwicklung   0.36
NBA           0.36
complement    0.36
POSITIVE LOGITS
class         0.44
classe        0.43
শ্রেণির        0.42
kelas         0.41
Klasse        0.40
stacks        0.39
کلاس          0.38
класу         0.38
മൂഹ           0.38
class         0.38
ACTIVATIONS DENSITY
0.000%