INDEX
Explanations
references to children's literature and educational activities
Follows a question
questions about consequences
Negative Logits
zijne            -0.62
rungsseite       -0.60
TargetException  -0.59
betreft          -0.59
kasarigan        -0.59
complexContent   -0.58
RegressionTest   -0.57
שוליים           -0.56
featureID        -0.56
SharedDtor       -0.55
Positive Logits
school           0.43
enz              0.42
friendship       0.41
COMPR            0.39
robot            0.39
yummy            0.39
smells           0.39
living           0.39
doctor           0.38
boys             0.38
Activations Density: 0.175%