INDEX
Explanations
proper nouns, specifically names of people
references to notable individuals, specifically filmmakers and politicians
Negative Logits
Tune        -0.74
antioxid    -0.68
yang        -0.62
Duck        -0.61
XD          -0.61
Tayyip      -0.61
Erdogan     -0.60
Pok         -0.60
eering      -0.60
Takeru      -0.59
Positive Logits
owship      1.29
ritch       0.92
ats         0.83
ows         0.83
nuts        0.80
Winged      0.77
atan        0.77
ecided      0.76
umbs        0.76
angel       0.74
Activations Density: 0.032%