Explanations
mentions of the name "Mos" at different activation levels
mentions of the name "Mos" or variations thereof
Negative Logits
ACTED           -0.75
istical         -0.73
DragonMagazine  -0.72
ordinate        -0.72
ISM             -0.70
STEP            -0.69
Accessory       -0.69
istically       -0.68
sburgh          -0.67
istics          -0.66
Positive Logits
illion  0.88
fet     0.83
ques    0.83
cos     0.83
bat     0.82
Mos     0.82
cone    0.81
quit    0.80
hin     0.79
erver   0.79
Activations Density: 0.005%