INDEX
Explanations
phrases related to knowledge or awareness
pronouns and their usage in context, particularly focusing on the concept of knowledge and needs
Negative Logits
Token        Logit
uca          -0.76
hua          -0.74
oother       -0.73
arty         -0.69
intendent    -0.68
eka          -0.68
OGR          -0.66
rition       -0.66
cffffcc      -0.66
uci          -0.65
Positive Logits
Token        Logit
imaginable   1.16
except       1.10
except       1.05
ses          0.75
pires        0.74
hoop         0.72
including    0.72
Including    0.71
conceivable  0.71
nut          0.70
Activations Density: 0.343%