Explanations
words related to language, especially verbs indicating action and concepts related to language
concepts related to language development and its dynamic nature
Negative Logits
iens        -0.67
Stir        -0.63
ovember     -0.63
@           -0.62
âĵĺ         -0.62
isode       -0.60
Liter       -0.59
iasis       -0.59
ourn        -0.58
inburgh     -0.57
Positive Logits
outweigh    0.71
inherently  0.68
anyway      0.68
poorly      0.68
anyways     0.67
crappy      0.65
tariff      0.65
finite      0.64
subsidized  0.64
unreliable  0.64
Activations Density: 1.760%