INDEX
Explanations
the word "is" within sentences
statements about likelihood or probability
Negative Logits
Token         Logit
ahime         -0.85
sidx          -0.80
wig           -0.77
Subject       -0.74
blind         -0.73
reen          -0.73
ware          -0.69
iciary        -0.67
HUD           -0.65
Instruments   -0.64
Positive Logits
Token         Logit
Ĭ±            0.69
ĻĤ            0.69
Extra         0.68
livest        0.67
ģĸ            0.67
ģ«            0.65
gon           0.63
ĺ             0.63
rolet         0.62
ĺħ            0.62
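Several of the positive-logit tokens (Ĭ±, ĻĤ, ģĸ, ģ«, ĺ, ĺħ) look like raw bytes rendered through a byte-level BPE display alphabet rather than mis-encoded text. The sketch below assumes, without confirmation from this page, that the underlying tokenizer uses the GPT-2 style byte-to-character table; under that assumption it recovers the raw bytes behind each token. The helper name token_to_bytes is hypothetical and introduced only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2 style byte-level BPE."""
    # Bytes that already print cleanly map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Hypothetical helper: map a displayed token back to its raw bytes.

    Assumes the token is written in the GPT-2 display alphabet; a literal
    space or any character outside that alphabet raises KeyError.
    """
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


if __name__ == "__main__":
    # Tokens copied from the positive-logit list above.
    for tok in ["Ĭ±", "ĻĤ", "ģĸ", "ģ«", "ĺ", "ĺħ"]:
        print(repr(tok), token_to_bytes(tok).hex(" "))
```

If the assumption holds, none of these tokens is a complete UTF-8 character on its own; each is a fragment of a multi-byte sequence, which would explain why they display as unfamiliar symbols.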
Activations Density 0.000%