Explanations
concepts related to being trapped, buried, or enclosed
Negative Logits
يتيمه          -0.95
Efq            -0.78
pleaſure       -0.77
perſon         -0.76
raiſ           -0.74
purpoſe        -0.72
perature       -0.72
ſta            -0.71
reaſon         -0.71
ArrowToggle    -0.70
Positive Logits
within         0.70
behind         0.65
inside         0.63
firmly         0.57
between        0.56
in             0.56
beneath        0.55
tightly        0.55
amongst        0.55
among          0.54
Activations Density: 0.371%