Explanations
proper nouns related to names of people or places
the presence of a specific name or person, likely related to a public figure or news topic
Negative Logits
wide          -0.71
brackets      -0.66
boost         -0.64
extensions    -0.63
fixed         -0.63
limits        -0.60
IM            -0.60
extend        -0.59
growth        -0.58
extension     -0.58
Positive Logits
ía            4.69
í             1.63
Í             1.59
ín            1.54
ás            1.49
ís            1.49
ón            1.40
ia            1.32
idad          1.30
iola          1.26
Activations Density 0.008%