INDEX
Explanations
phrases indicating something obvious or well-known
the phrase "of course" in various contexts
Negative Logits
iry       -0.81
rouse     -0.73
idas      -0.68
bish      -0.67
erd       -0.66
Fram      -0.63
idy       -0.62
ment      -0.58
RANT      -0.58
mented    -0.58
Positive Logits
س         0.79
NULL      0.74
ña        0.74
onga      0.67
olkien    0.64
ãĥķãĤ©    0.63
forth     0.60
hesda     0.59
nit       0.59
spoil     0.58
Activations Density 0.028%