INDEX
Explanations
verbs or phrases related to explaining or describing something
phrases and terms related to definitions and descriptions
Negative Logits
maneu     -0.69
thrive    -0.67
archive   -0.62
bj        -0.62
win       -0.61
qus       -0.61
Whe       -0.60
ensued    -0.59
wr        -0.59
ath       -0.59
Positive Logits
Franch                0.73
guiName               0.69
gdala                 0.69
derog                 0.68
figur                 0.67
joking                0.66
isSpecialOrderable    0.65
lightly               0.64
=#                    0.63
metaphor              0.61
Activations Density 0.429%