Explanations
names or words related to people's identities
proper nouns and names of people or entities
Negative Logits
WINDOWS  -0.83
IX       -0.81
powd     -0.81
çĦ       -0.81
Sup      -0.80
Instr    -0.78
OCT      -0.76
س        -0.76
NEC      -0.74
NB       -0.74
Positive Logits
er     1.74
ER     1.55
ers    1.48
erker  1.37
ler    1.35
eric   1.28
ller   1.27
uer    1.27
erer   1.24
zer    1.23
Activations Density: 0.188%