Negative Logits
ປັນ    0.56
obnoxious    0.55
чиго    0.52
ﻒ    0.50
maliciously    0.50
falsa    0.50
פ    0.49
टे    0.49
دين    0.49
sacré    0.48
Positive Logits
Instructions    0.48
Appreciation    0.48
Bake    0.47
lä    0.47
External    0.47
Solutions    0.47
Interaction    0.46
Financial    0.46
Instruction    0.46
Finances    0.46
Activations Density: 0.004%