INDEX
Explanations
references to the term "ides"
occurrences of the word "ides."
Negative Logits
STER        -0.68
iona        -0.67
thening     -0.67
shaw        -0.64
gag         -0.63
Kard        -0.61
brance      -0.60
restraint   -0.59
pron        -0.58
enegger     -0.58
Positive Logits
ides        1.33
IDES        1.22
ide         1.17
IDE         1.09
creen       0.91
ktop        0.84
llah        0.82
bum         0.81
hare        0.78
maid        0.78
Activations Density: 0.009%