Explanations
regardless of circumstances
Negative Logits
Token             Logit
Ха                0.93
ка                0.91
важных            0.84
ׁ                 0.83
взаимодействия    0.82
важное            0.81
communautés       0.80
файлов            0.80
خبری              0.80
主な              0.80
Positive Logits
Token         Logit
whether       1.43
whether       1.18
Whether       1.16
imperfect     1.12
Whether       1.10
WHETHER       1.06
bizarre       1.03
or            1.03
misguided     0.97
occasional    0.96
Activations Density 0.076%