INDEX
Explanations
phrases related to actions or activities
instances of the word "out" in various contexts
Negative Logits
pim        -0.76
ankles     -0.74
multim     -0.69
wallets    -0.68
averaging  -0.64
wrists     -0.64
heels      -0.64
lottery    -0.63
reckoning  -0.63
ears       -0.61
Positive Logits
rey   1.02
inho  0.96
atis  0.88
ube   0.86
lov   0.86
hest  0.84
reau  0.83
cade  0.80
hee   0.80
ek    0.78
Activations Density: 0.071%