INDEX
Explanations
verbs related to positioning or adjusting something in a particular way
Negative Logits
loo       -0.89
ilk       -0.85
perty     -0.72
hell      -0.72
bara      -0.71
gery      -0.69
HAEL      -0.68
geoning   -0.67
sein      -0.66
scene     -0.65
Positive Logits
ments       1.21
eering      0.95
arity       0.91
aligned     0.87
align       0.85
alignment   0.85
iances      0.84
aligned     0.81
icut        0.79
inates      0.78
Activations Density 0.032%