INDEX
Explanations
proper nouns related to individuals
the word "mann" in various contexts
Negative Logits
token       logit
nces        -0.71
ngth        -0.70
WHERE       -0.69
welf        -0.67
rip         -0.65
=-=-=-=-    -0.65
bound       -0.64
Citiz       -0.62
cess        -0.62
vette       -0.61
Positive Logits
token       logit
mann        0.97
elson       0.90
strom       0.90
enegger     0.86
otti        0.84
ufact       0.83
ogl         0.79
stein       0.79
ophon       0.78
gart        0.78
Activations Density 0.022%
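
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration of that reading, not the dashboard's own code: the function name `activation_density`, the zero threshold, and the synthetic activation list are all assumptions made for the example.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, with the feature active on 22.
acts = [0.0] * 100_000
for i in range(22):
    acts[i * 1000] = 1.5  # arbitrary non-zero activation value

density = activation_density(acts)
print(f"{density:.3%}")
```

With 22 active positions out of 100,000 the result is 0.00022, which is displayed as 0.022%.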