INDEX
Explanations
names of individuals, likely related to news articles or publications
proper nouns, particularly names of people and organizations
Negative Logits
Token        Logit
Frozen       -0.67
idle         -0.64
Shape        -0.63
netflix      -0.63
polar        -0.59
favour       -0.57
adesh        -0.56
tics         -0.54
Downloadha   -0.52
Rahul        -0.52
Positive Logits
Token        Logit
ONSORED      0.71
enson        0.63
ゼウス       0.60
heny         0.59
Jr           0.59
mann         0.58
nel          0.58
ertodd       0.57
Kills        0.57
eyes         0.57
Activations Density 0.437%