INDEX
Explanations
information and knowledge-related terms, such as 'information', 'knowledge', 'facts', and 'advice'
Negative Logits
Token       Logit
alone       -0.71
Parables    -0.66
igham       -0.65
awar        -0.64
rine        -0.64
gg          -0.62
kus         -0.62
mini        -0.61
neys        -0.61
atos        -0.60
Positive Logits
Token       Logit
glean       1.08
arial       0.96
retrieval   0.95
overload    0.94
gathered    0.94
regarding   0.92
pertaining  0.89
about       0.87
theoret     0.86
gathering   0.84
Activations Density 0.471%
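
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that density means the share of token positions where the feature's activation exceeds zero. The dashboard does not state its definition, so this is an assumption; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumption: "density" is the fraction of token positions on which the
    feature fires at all; the dashboard does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical activations for one feature over eight token positions.
sample = [0.0, 0.0, 3.2, 0.0, 0.0, 0.0, 0.0, 1.1]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.471% would correspond to about 47 firing positions per 10,000 tokens.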