INDEX
Explanations
phrases related to knowledge and understanding
instances of awareness or understanding
Negative Logits
Token          Logit
cohol          -0.73
atre           -0.73
ermanent       -0.71
clusive        -0.70
bury           -0.70
acco           -0.68
aredevil       -0.68
vity           -0.68
elight         -0.68
ItemTracker    -0.68
Positive Logits
Token          Logit
ledged         1.14
instinctively  1.07
how            1.04
ledge          0.99
firsthand      0.97
nothing        0.96
exactly        0.85
darn           0.84
what           0.83
nothing        0.83
Activations Density: 0.066%
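
One common reading of an activation density figure is the fraction of token positions on which the feature fires at all. The sketch below assumes that definition; the page does not say how its number was computed. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 66 of which activate the feature.
sample = [0.0] * 99_934 + [1.5] * 66
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.066% would mean the feature is active on roughly 66 of every 100,000 tokens.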