Explanations
specific items followed by a colon
Negative Logits
Р	1.17
ことから	1.11
্শন	1.11
прият	1.07
Ма	1.02
ся	1.02
יות	1.01
ෙන	0.98
Не	0.96
Faites	0.96
Positive Logits
luxury	1.03
lur	0.97
institutional	0.95
€˜	0.93
irikan	0.89
jungle	0.89
ந	0.89
modest	0.89
frontline	0.89
phonon	0.89
Activations Density 0.006%