Negative Logits
Token       Logit
acky        0.42
Intern      0.38
Intern      0.38
oria        0.36
achel       0.36
osim        0.36
igi         0.36
úly         0.36
som         0.35
㙋          0.35
Positive Logits
Token       Logit
Dorm        0.54
itories     0.53
dorm        0.50
dorm        0.44
sfai        0.43
ITATION     0.43
alfabeto    0.42
인생        0.41
Ⴌ           0.40
ității      0.40
Activations Density 0.001%