Explanations
references to organizational structures and authority figures
Negative Logits
alam      -0.17
ORMAT     -0.16
ptune     -0.15
adera     -0.15
importe   -0.15
imentos   -0.15
enties    -0.14
Zoom      -0.14
coni      -0.14
avia      -0.14
Positive Logits
Charm     0.19
odynam    0.18
ève       0.17
ê         0.15
aby       0.15
yk        0.15
æķĪ       0.14
fast      0.14
ree       0.14
gang      0.14
Activations Density: 0.043%