INDEX
NEGATIVE LOGITS
Token        Logit
basically    0.48
basically    0.45
basics       0.43
uo           0.40
Basically    0.38
Basically    0.38
IGet         0.38
migr         0.37
anus         0.37
也沒有        0.37
POSITIVE LOGITS
Token        Logit
sorry        0.66
Sorry        0.65
Sorry        0.60
sorry        0.57
सॉरी         0.56
regret       0.53
regretted    0.53
regrett      0.47
apology      0.46
сожалению    0.43
Activations Density: 0.004%