INDEX
Explanations
the word "specific" in various forms and contexts
Negative Logits
ร        -0.17
our      -0.16
ekler    -0.15
hood     -0.15
orous    -0.14
yll      -0.14
ony      -0.14
hra      -0.14
Holmes   -0.14
ipo      -0.13
Positive Logits
ities      0.30
ally       0.25
ially      0.24
ity        0.24
-purpose   0.23
ALLY       0.22
/general   0.19
ITY        0.17
ITIES      0.16
ités       0.16
Activations Density: 0.037%