INDEX
Explanations
specific instances where understanding or knowledge is explicitly mentioned
instances of the word "understands."
Negative Logits
onies       -0.86
ads         -0.74
downed      -0.73
collection  -0.72
cial        -0.67
haps        -0.65
pmwiki      -0.64
ammy        -0.63
rav         -0.63
ogue        -0.63
Positive Logits
understands  0.96
¿½           0.89
understood   0.80
Understand   0.78
ãĤ©          0.78
SERV         0.77
ledged       0.73
terday       0.72
Trident      0.70
uitive       0.69
Activations Density 0.003%