INDEX
Explanations
introductory and transitional phrases that mark the start of a sentence or clause
Negative Logits
etri      -0.18
rvé       -0.16
hoch      -0.16
fal       -0.14
iversal   -0.14
hq        -0.14
nearest   -0.14
ries      -0.13
Carn      -0.13
ahas      -0.13
Positive Logits
alue          0.16
tie           0.16
orough        0.15
Additionally  0.15
Additionally  0.15
тап           0.15
nữa           0.14
tabpanel      0.14
dě            0.14
弄            0.14
Activations Density 0.102%