INDEX
Explanations
URLs related to GitHub repositories
GitHub repositories
github links
Negative Logits
<<<<<<<<<<<<<<    -0.72
Ladd              -0.68
ondes             -0.66
Brod              -0.66
Kach              -0.66
везе              -0.66
Unwin             -0.65
Xavi              -0.65
peed              -0.65
Rukh              -0.64
Positive Logits
github            1.17
Github            0.91
github            0.86
Github            0.86
ITHUB             0.84
repository        0.83
GitHub            0.82
Repository        0.82
GitHub            0.78
UserRepository    0.77
Activations Density 0.040%