INDEX
Explanations
phrases or questions involving the concept of knowledge or understanding
the phrase "know how."
Negative Logits
ceptions    -0.71
UME         -0.69
iculture    -0.69
agonists    -0.67
idered      -0.65
odder       -0.64
........    -0.62
holder      -0.61
izu         -0.61
Reader      -0.61
Positive Logits
beit        0.80
HCR         0.79
much        0.78
MUCH        0.73
itzer       0.73
ells        0.69
soever      0.69
ling        0.69
ever        0.68
much        0.67
Activations Density 0.068%