INDEX
Negative Logits
Token         Logit
Lau           -0.07
çocuğ         -0.06
slots         -0.06
lf            -0.06
helfen        -0.06
LGBTQ         -0.06
Wars          -0.06
              -0.06
SDLK          -0.06
Kaf           -0.06
Positive Logits
Token         Logit
-confidence   0.07
ibia          0.07
smoothed      0.06
.Multi        0.06
默认          0.06
civilian      0.06
patterns      0.06
aligned       0.06
Difficulty    0.06
Initialized   0.06
Activations Density: 0.012%