Explanations
negative states or deficiencies
Negative Logits
ie        0.94
aya       0.91
٠         0.91
manship   0.86
iti       0.84
ができる  0.84
onde      0.81
ina       0.80
was       0.80
éges      0.80
Positive Logits
т           0.96
ভাবে        0.91
б           0.90
к           0.83
ิศ          0.81
डायरेक्टली  0.81
ット        0.80
INSPIRE     0.79
pétales     0.78
Average     0.77
Activations Density: 0.714%