INDEX
Explanations
email addresses with specific patterns
email addresses or contact information
Negative Logits
Zombies   -0.85
Vi        -0.79
Borders   -0.77
Ruler     -0.76
Barrier   -0.73
Robot     -0.71
Clash     -0.70
Bunny     -0.69
Strike    -0.69
Elves     -0.69
POSITIVE LOGITS
@         1.38
_         1.04
info      1.00
podcast   0.95
pedia     0.94
reports   0.93
facts     0.90
archives  0.90
oft       0.89
odcast    0.89
Activations Density 0.206%