INDEX
Explanations
prepositions of place
the concept of origins or sources
Negative Logits
Token       Logit
hor         -0.74
neutron     -0.74
fodder      -0.70
che         -0.69
horizon     -0.68
chew        -0.67
pockets     -0.66
lette       -0.64
arsen       -0.64
spaghetti   -0.64
Positive Logits
Token       Logit
}.          0.72
%).         0.70
Columb      0.70
OAD         0.69
Rev         0.68
ealous      0.68
Coliseum    0.68
Cause       0.67
Admission   0.66
Frazier     0.65
Activations Density 0.000%