Explanations
fictional characters and popular media
Negative Logits
女優  0.70
obscene  0.66
बसों  0.65
ರಾಷ್ಟ  0.64
窪  0.64
合い  0.64
opathie  0.63
椙  0.63
charts  0.62
<unused1835>  0.62
Positive Logits
Vegeta  1.01
Goku  1.00
Arag  0.99
Arya  0.98
Batman  0.98
Luffy  0.96
Eren  0.96
Luke  0.96
Zoro  0.95
olverine  0.94
Activations Density: 0.136%
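
As a rough illustration of what a density figure like this usually means, the sketch below computes the fraction of token positions on which a feature has a nonzero activation. This is an assumption about the metric, not a description of how this page calculated it; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a feature "fires" when its activation exceeds `threshold`.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 token positions, 136 of them active.
sample = [0.0] * 99_864 + [2.3] * 136

density = activation_density(sample)
print(f"Activations Density: {density:.3%}")
```

Under this reading, a density of 0.136% would mean the feature is active on roughly 1 in every 735 tokens, which fits a sparse feature tied to a narrow set of names.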