INDEX
Explanations
instances where a particular level of sufficiency or capability is met
Negative Logits
Mens      -0.67
kind      -0.66
Gh        -0.66
bull      -0.65
Guth      -0.65
raine     -0.60
Braz      -0.59
hov       -0.59
tell      -0.58
misc      -0.57
Positive Logits
hots                0.74
to                  0.66
rehend              0.63
ITIES               0.63
quantities          0.63
externalToEVAOnly   0.62
ioned               0.60
tones               0.60
indeed              0.59
onductor            0.59
Activations Density: 0.030%