INDEX
NEGATIVE LOGITS
Tips           -0.08
wafer          -0.08
terminated     -0.08
работ          -0.07
teammate       -0.07
Parl           -0.07
അറ             -0.07
serv           -0.07
Guards         -0.07
Serv           -0.07

POSITIVE LOGITS
attribut        0.14
atrib           0.13
attribution     0.13
atribu          0.12
Attribution     0.11
attributed      0.11
inaccur         0.10
blame           0.10
attribute       0.10
incorrectly     0.10
Activations Density 0.076%