INDEX
Explanations
names of people or professions
questions and statements related to accountability and judgment
Negative Logits
ARDIS         -0.69
geoning       -0.69
inel          -0.65
hement        -0.61
lesh          -0.61
MFT           -0.60
incapac       -0.59
=-=-          -0.58
imov          -0.57
abad          -0.57

Positive Logits
doing          1.07
Doing          1.02
accomplished   0.98
done           0.97
entail         0.96
SourceFile     0.93
fuss           0.88
Saying         0.88
Done           0.87
entails        0.87
Activations Density: 0.607%