INDEX
Explanations
names of people
proper names, particularly related to notable individuals
Negative Logits
GoldMagikarp  -0.85
ModLoader     -0.78
OURCE         -0.70
SPONSORED     -0.68
CONT          -0.66
REAM          -0.64
Discussion    -0.62
ILA           -0.62
CG            -0.61
FU            -0.61
Positive Logits
sson     1.45
kson     1.22
ovich    1.13
owicz    1.06
ovsky    0.97
eson     0.96
son      0.95
opoulos  0.93
etti     0.92
acci     0.91
Activations Density 0.222%