INDEX
Explanations
mentions of instruction manuals or guides
references to "manuals" or instructions
Negative Logits
Token           Logit
ymes            -0.80
nces            -0.77
ãĤ¤ãĥĪ          -0.75
ween            -0.71
addons          -0.69
encers          -0.69
èª              -0.69
orkshire        -0.68
adding          -0.67
mary            -0.67
Positive Logits
Token           Logit
dexterity       1.00
typew           0.93
manual          0.88
transmission    0.87
exha            0.85
urally          0.78
override        0.75
induction       0.74
destro          0.74
©¶æ             0.72
Activations Density: 0.007%