Explanations
specific nouns and concepts
Negative Logits
ibus     1.00
until    0.89
obil     0.87
until    0.83
abus     0.82
elu      0.81
antle    0.81
ex       0.79
tif      0.79
elm      0.78
Positive Logits
konkrét      1.53
συγκεκρι     1.31
konkre       1.30
belirli      1.30
quelcon      1.28
afferma      1.28
különböző    1.26
farklı       1.26
ప్            1.23
gerçekten    1.23
Activations Density 0.376%