INDEX
Explanations
words related to German or Swedish letters and specific names
Negative Logits
Gemini       -0.66
ibly         -0.63
naires       -0.63
ibility      -0.62
aneously     -0.60
blindness    -0.60
ibilities    -0.59
Hearts       -0.58
Cobra        -0.58
Chavez       -0.57
Positive Logits
zbek         1.15
ö            1.02
ller         0.98
·            0.95
¸            0.95
hl           0.92
rm           0.92
misc         0.91
vre          0.90
kk           0.88
Activations Density: 0.021%