INDEX
Explanations
references to the name "Ru" with different numerical values associated with the activations
occurrences of the name "Ru" or variations of it related to different contexts or subjects
Negative Logits
Seym          -0.67
croft         -0.66
nikov         -0.65
£ı            -0.64
chool         -0.64
sie           -0.64
)=(           -0.61
Remastered    -0.61
Palest        -0.59
Ribbon        -0.59
Positive Logits
pees          1.24
pee           1.14
ination       1.11
ined          1.10
pert          1.00
Paul          0.99
inal          0.94
pton          0.94
inous         0.93
ining         0.92
Activations Density: 0.037%