INDEX
Explanations
words related to positions or locations
count nouns that indicate various categories or groups of entities
Negative Logits
ocracy    -0.68
cca       -0.62
aeda      -0.60
igun      -0.58
Ashes     -0.57
ghazi     -0.56
Relief    -0.56
Narr      -0.55
Corpus    -0.55
imaru     -0.55
Positive Logits
mith      0.85
sprang    0.78
logged    0.75
remain    0.73
dealt     0.72
remained  0.70
yielded   0.70
were      0.69
rescind   0.68
have      0.68
Activations Density: 0.303%