Explanations
Numerical values or statistics related to significant concepts.
Numerical values interspersed with specific terms and names.
Prepositions in German text, particularly "von", "aus", "in", "mit", and "bei".
Tokens following the words "von", "dem", "der", or "den".
"von" followed by names.
Negative Logits
Token         Logit
Monfieur      -0.86
Efq           -0.82
itſelf        -0.82
himſelf       -0.80
Majefty       -0.79
themſelves    -0.78
myſelf        -0.78
alſo          -0.77
faſt          -0.76
pleaſure      -0.75
Positive Logits
Token         Logit
einer         0.78
dem           0.77
einem         0.76
der           0.75
den           0.55
()){          0.52
divers        0.49
MLLoader      0.48
mehreren      0.48
seinem        0.47
Activations Density: 0.037%
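As a rough illustration of what a density figure like this may represent, the sketch below assumes that activation density is the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the dashboard itself does not state how the value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (# active tokens) / (# tokens sampled).
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 4 activate the feature.
sample = [0.0] * 10_000
for i in (12, 345, 6789, 9000):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.040%"
```

Under this assumed definition, a density of 0.037% would correspond to the feature firing on roughly 37 of every 100,000 tokens.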