Negative Logits
𝙀          0.49
као        0.48
igneur     0.48
depending  0.48
кен        0.47
ütfen      0.46
餞         0.46
وامی       0.45
зне        0.45
던         0.45
Positive Logits
Diagn       0.48
Tisch       0.45
ODE         0.45
Narr        0.45
Cafe        0.43
Bets        0.43
therapist   0.43
humanities  0.42
R           0.42
café        0.42
Activations Density: 0.001%