Explanations
the word "is" in sentences
affirmative statements or declarations
Negative Logits
Token          Logit
ethy           -0.82
OTO            -0.76
morph          -0.74
Defin          -0.74
actionGroup    -0.73
nesia          -0.73
orph           -0.72
oglu           -0.71
dinand         -0.70
ijn            -0.69
Positive Logits
Token          Logit
Weather        0.70
Deal           0.66
kindly         0.62
Wen            0.61
!]             0.61
unbiased       0.59
lowly          0.59
aven           0.58
Grimm          0.58
ratio          0.56
Activations Density 0.000%