Negative Logits
instruction    0.46
Signal         0.45
signal         0.44
Instruction    0.44
signaled       0.42
Conflict       0.41
SSID           0.41
পদ্ধতির         0.41
instructions   0.40
Token          0.40
Positive Logits
robots         0.98
robotics       0.94
机器人          0.94
robots         0.90
robot          0.89
robot          0.86
ロボ            0.83
робо           0.82
Robots         0.82
Robotics       0.79
Activations Density: 0.016%