Explanations
phrases that denote an exception or contrast
phrases indicating separation or distinction
Negative Logits
crawl        -0.68
Huck         -0.65
tiss         -0.63
tumble       -0.62
$$           -0.60
IZE          -0.60
cat          -0.59
descent      -0.58
iking        -0.58
00000000     -0.57

Positive Logits
heid          1.43
ments         1.27
comings       1.06
Ħ¢            0.90
osite         0.87
isphere       0.86
Apart         0.80
inguished     0.77
MENT          0.75
Side          0.75
Activations Density: 0.006%