INDEX
Explanations
Nouns and definite articles in German-language texts
Negative Logits
Leben          -0.17
Antworten      -0.17
jay            -0.16
Unternehmen    -0.16
FC             -0.16
Angebot        -0.15
Thema          -0.15
ja             -0.15
Fragen         -0.14
jin            -0.14
Positive Logits
Sz             0.18
Phase          0.18
Minute         0.17
Offensive      0.17
Serie          0.17
Eb             0.17
Palette        0.17
Episode        0.16
Exist          0.16
uong           0.16
Activations Density 0.026%
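
As a rough reading of the density figure above: assuming "Activations Density" means the share of token positions on which this feature has a nonzero activation (the page itself does not define the term), 0.026% works out to about one activation per 3,846 tokens, or about 260 per million. The sketch below only does that arithmetic; the function names are hypothetical and not part of any dashboard API.

```python
def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens per activation, given a density in percent.

    Assumes density = (tokens where the feature fires) / (all tokens) * 100.
    """
    if density_percent <= 0:
        raise ValueError("density_percent must be positive")
    return 100.0 / density_percent


def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of activating tokens in a sample of n_tokens."""
    return n_tokens * density_percent / 100.0


if __name__ == "__main__":
    density = 0.026  # percent, as reported on the page
    print(round(tokens_per_activation(density)))
    print(round(expected_activations(density, 1_000_000)))
```

Under this assumption the feature is very sparse, which fits a feature that responds to a narrow class of tokens.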