Explanations
No Explanations Found
Negative Logits
raised       -0.87
locking      -0.83
mental       -0.81
translation  -0.79
assisted     -0.74
pressed      -0.74
shot         -0.73
fetched      -0.73
blown        -0.72
locks        -0.72
Positive Logits
Journals     0.83
Vide         0.75
senal        0.71
VID          0.71
OFFIC        0.66
aloud        0.66
Spect        0.65
english      0.64
Volcano      0.64
Beaut        0.63
Activations Density: 0.000%
No Known Activations
This feature has no known activations.