INDEX
NEGATIVE LOGITS
Mathemat      0.43
मध्ये          0.42
stylesheets   0.42
オリ           0.41
differential  0.40
execut        0.40
einander      0.39
presentan     0.39
නිෂ්පා          0.39
constructs    0.39
POSITIVE LOGITS
sorry         0.74
Sorry         0.67
sorry         0.65
okay          0.59
okay          0.57
Sorry         0.56
It            0.52
Okay          0.50
걍            0.49
Ok            0.49
Activations Density 0.000%