INDEX
Explanations
references to the concept of something being "in" a certain state or condition
the preposition "in" across various contexts
Negative Logits
alike      -0.71
distingu   -0.63
inclined   -0.62
suscept    -0.58
overcame   -0.57
convol     -0.57
obedient   -0.56
absorbs    -0.55
cia        -0.55
behaves    -0.55
Positive Logits
escap      1.35
humane     1.13
versely    1.01
jeopardy   1.01
fact       1.00
authent    0.98
ked        0.97
built      0.94
bred       0.93
lined      0.93
Activations Density 0.141%