Explanations
"personal" before categories
Negative Logits
provide       0.48
Па            0.44
a             0.41
خ             0.40
Бе            0.39
া             0.39
Су            0.39
ה             0.39
Ж             0.38
bless         0.38
Positive Logits
personal      1.16
Personal      1.02
personal      0.98
Personal      0.96
PERSONAL      0.93
persoonlijke  0.90
개인          0.89
पर्सनल        0.85
προσωπ        0.84
व्यक्तिगत      0.83
Activations Density 0.031%