INDEX
Explanations
names of people and places
mentions of specific individuals or proper nouns
Negative Logits
bottleneck    -0.77
ģĸ            -0.67
ĸļ            -0.66
£ı            -0.66
Thumbnails    -0.65
Gemini        -0.63
limiting      -0.63
*.            -0.62
renamed       -0.62
sidebar       -0.62
Positive Logits
isha          1.01
add           0.99
ona           0.98
osh           0.97
ok            0.96
annie         0.96
anta          0.96
FA            0.96
azz           0.95
aj            0.93
Activations Density: 0.219%