Explanations
mentions of "Computer Science" and related terms
Negative Logits
eca          -0.80
uality       -0.75
abouts       -0.72
sweet        -0.70
ModLoader    -0.70
Rib          -0.67
ctr          -0.66
IU           -0.66
ï¸ı          -0.66
eous         -0.65
Positive Logits
ized          0.94
readable      0.90
terminals     0.88
science       0.86
izable        0.86
interfaces    0.83
izes          0.81
anical        0.79
generated     0.78
isation       0.78
Activations Density: 0.026%