Explanations
names describing people/organizations
Negative Logits
v	0.50
žene	0.48
il	0.48
aları	0.46
hugged	0.46
ata	0.45
w	0.45
Penang	0.45
нию	0.45
ası	0.44
Positive Logits
輋	0.52
⺀	0.51
investig	0.49
ถมศึกษา	0.49
Untersuch	0.47
das	0.47
[	0.47
comments	0.46
investigator	0.46
ากร	0.45
Activations Density 0.000%